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Preface 

This presentation of time series analysis techniques has been devel- 
oped by the author in the process of teaching (since 1971) a graduate 
level course on the subject to scientists, engineers, and computer an- 
alysts at NASA Langley Research Center. The intent is to develop, 
from the beginning, the basic understanding necessary to properly ap- 
ply modern spectral analysis techniques. The subject rests on a firm 
foundation in the theory of probability, which will be reviewed in this 
monograph. Thus, the only prerequisites are an ordinary engineering 
knowledge of calculus and some acquaintance with linear system theory. 
However, familiarity with random process theory, as provided in Prob- 
ability , Random Variablts y and Stochastic Processes by Papoulis, and 
with Fourier analysis techniques, as provided in The Fourier Transform 
and Its Applications by Bracewell, would be helpful. 

Although there are many textbooks on time series analysis, several of 
which the author has used in his courses, this monograph takes a differ- 
ent approach from most. First, the theory in this presentation has been 
developed, insofar as possible, for continuous data. This postpones the 
inevitable use of discrete mathematics, which the author believes tends 
to obscure physical understanding, until after the reader has gained 
some familiarity with the concepts. Only then are the computational 
details for digital data introduced. Second, the author assumes that 
most readers will have access to either standard computer software or 
hard-wired spectral analyzers to do the work of computation. One big 
danger of such standard analysis techniques, however, is that they will 
always yield an output, even if the input does not satisfy the assump- 
tions on which the analysis is based. Thus, this monograph seeks to 
provide the theoretical overview necessary to correctly apply the full 
range of these powerful techniques. Finally, time series analysis is a 
vast and rapidly changing field. In an attempt to remain complete 
and current, the last chapter introduces the reader to many specialized 
techniques and areas where research is presently in progress. 

The author would like to express his appreciation to William E. 
Zorumski and Stephen K. Park, who worked almost as hard in reviewing 
this manuscript as the author did in writing it. 


Jay C. Hardin 

NASA Langley Research Center 

Hampton, VA 23665-5225 PRECEDING PAGE BLANK NOT FUMED 

July 16, 1985 
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Chapter I 
Introduction 

Consider a record of length T of a real function f(t) as shown in 
figure 1. By convention, the independent variable is called “time.” 
although it need not actually be time. Instead, the function may depend 
on distance or angle or any other variable of interest. The data record 
shown is of finite length, since that is all that is ever available in the real 
world, and need not be continuous but may, in fact, consist of digital 
data taken at a set of discrete times. This monograph will be concerned 
with the development, interpretation, and use of various techniques to 
extract information from such a record. 



1.1 Why Harmonic Analysis? 

Many time series analysis techniques involve harmonic analysis, that 
is. decomposition of the record f(t) into a collection of sines and cosines 
of various frequencies. Before considering these techniques, it is relevant 
to ask why it would ever be necessary or advantageous to represent a 
function by harmonic functions. Certainly many records of interest look 
very different from the well-behaved periodic sine and cosine functions. 
There are at least three answers to this question. 

Simple input/output relations for linear systems . Consider a signal 
x(t) that is passed through a linear, shift-invariant physical system to 
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Introduction to Time Senes Analysis 



Figure 2. Schematic of linear system. 


produce an output signal y(£) as shown in figure 2. Although the input 
and output are related by a convolution integral in the time domain, the 
harmonic representations of the input X(u;), where uj is the frequency 
in radians per second, and the output Y (u>) are related by the simple 
expression 

Y{u) = H{u)X(uj) (1.1) 

where H( u/) is called the frequency response function for the system. 
This fact, which is known as the convolution theorem, is the basis of 
many techniques for the solution of differential and integral equations 
and is an aid to understanding the response of linear systems. 

Ease of interpretation (diagnostics). Many time signals are not easily 
analyzed in the time domain. For example, figure 3 displays the voltage 
output time history of a microphone recording the noise radiated by a 
supersonic jet operating in an off-design condition. Such time histories 
are nearly unintelligible. However, although the time and frequency 
domain representations contain precisely the same information in the 
sense that one may be recovered from the other by integration, the 
generation and potential effects of a signal may often be more easily 
understood in the frequency domain. For example, figure 4 shows the 
power spectral density, as a function of frequency, for the time history 
in figure 3. From the frequency domain perspective in figure 4, it can 
be seen that most of the power in the signal is concentrated near 5 kHz. 
In addition, a screech tone, caused by oscillation of the shocks in the jet 
due to a natural instability of the jet plume, is apparent near 5.2 kHz. 

Another dramatic example of such analysis was given by Blackman 
and Tukey 1 in Measurement of Power Spectra when they quoted a letter 
from Walter E. Munk: . . we were able to discover in the general 

wave record a very weak low-frequency peak which would surely have 
escaped our attention without spectral analysis. This peak, it turns out, 
is almost certainly due to a swell from the Indian Ocean, 10,000 miles 
distant. Physical dimensions are: 1 mm high, a kilometer long/’ 
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.2 J .4 A .1 .7 

TlME(ms«c) 

Figure 3. Noise radiation by supersonic jet. 



3 • • 12 

Fr«qu«ncy, kHz 

Figure 4. Power spectral density of noise radiation. 
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Introduction to Time Senes Analysts 

Because of their ability to extract information from highly variable 
records, spectral analysis techniques are widely applied in fluid dy- 
namics, acoustics, and vibration. In addition, such analyses are read- 
ily accomplished with either modern digital computers or specialized 
hardware. 

Ease of simulation . Often it is necessary or desirable to excite a 
system with a particular time history or class of time histories, either 
in the laboratory or on a computer. However, it is not practical 
to develop an excitation system for each individual signal. Thus, 
if the signal can be decomposed into its constituent harmonics and 
ordinary oscillators (or harmonic functions on a computer) used to 
produce the excitation, simulation becomes appreciably easier and less 
expensive. This technique is used by electrodynamic shakers in missile 
and automobile vibration testing, for example. 

1.2 Deterministic or Random? 

An important question in the extraction of information from a record 
like that shown in figure 1 is as follows: Is the record unique or is 
it merely representative of an ensemble of records which might have 
been obtained? For example, a smoother version of figure 1 might be 
a record of the elevation as a function of distance along the track of 
an amusement park ride. If one were designing a cart to traverse that 
particular track, then this would be the unique (deterministic) record of 
interest. On the other hand, figure 1 might be a record of the vertical 
gust velocity as a function of time experienced by an aircraft flying 
through a thunderstorm. If one were designing an airplane, then the 
record would be viewed as merely representative of an ensemble of data 
that might have been obtained in many different thunderstorms. In this 
random case, the particular properties of the record at hand are not as 
interesting as the average properties of the whole ensemble of records 
which might have been obtained. To discuss such data, the concept of 
a random process must be introduced. 

Although many of the techniques developed in this monograph 
are equally applicable to deterministic data records, the monograph 
will primarily be concerned with the extraction of information from 
records that may be considered sample functions of random processes. 
The analysis of such nondeterministic records is a rapidly changing 
field with new techniques being devised continually. It is also a field 
requiring sound engineering judgment in the application of techniques 
and interpretation of results; many pitfalls await the unwary. It is hoped 
that this monograph will give the reader the understanding necessary 

4 


l l l i. r r ; ; : : 


UliULlililiLiLlIli U U li 



Chapter I Introduction 

for the proper application and interpretation of time series analysis 
techniques. 
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Chapter II 
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Harmonic Analysis 


Although several seemingly different forms of harmonic analysis are 
in common use today, they are all special cases of what is now called 
generalized harmonic analysis. While this monograph will occasionally 
require use of the full power of this elegant theory, for the most part, the 
ordinary idea of a Fourier transform as developed in advanced calculus 
courses will suffice. 


2.1 Fourier Transform Pair 

If a function f(t) is such that the integral 



1/(01 

(i + * 2 ) a 


dt 


converges for some a > 0, then f(t) may be written 


( 2 . 1 ) 


f(t)m r° F{u)j*<lu (2.2) 

J-OQ 

where 

f(w) = 2 sb/l /(t) *"*■'** (2 - 3) 

is called the Fourier integral transform 2 of f(t). If the variable t is - 
actually time, then cj is the frequency in radians per second. Since 
e luJt = cos wt + isinu/t, equation (2.2) provides a representation of 
the function f(t) in terms of the periodic sine and cosine functions. 
Equations (2.2) and (2.3) form what is called a Fourier transform 
pair and are the fundamental Fourier transform pair that will be used 
throughout this monograph. 

As will be discussed further in chapter IV, one has considerable 
freedom in defining a Fourier transform pair. In particular, the 
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Introduction to Time Series Analysis 

factor {2ir)~ l may be placed before either the time or the frequency 
integration. The reason for the above choice is that it allows the total 
power in a random process to be obtained by integrating its power 
spectral density over all frequencies with no proportionality factor 
required. The inherent simplicity and elegance of this relationship 
seems to the author to be worthy of achieving. On the other hand, 
with the above definition of a transform pair, the (27r) -1 factor must be 
reserved for the frequency integration in defining the frequency response 
function of a linear, shift-invariant system, in order to preserve the 
simplicity of fundamental relations such as equation (1.1). Thus, this 
monograph will violate the above convention in the single case of a 
frequency response function. 

The Fourier integral representation for the function f(t) (eq. (2.2)) 
converges to f(t) at every point where f(t) is continuous. If f(t) is 
discontinuous at some point t = to , then the integral in equation (2.2) 
will converge to 

/(#) + /(* o) 

2 

the average of the right- and left-hand limits of f{t) at t — to, provided 
that these limits exist. 

2.2 Examples 

The Fourier integral (eq. (2.2)) has different characteristics depen- 
dent on the properties of f{t). Consider the following special cases. 

Transient functions. If f(t) is bounded and approaches zero as 
|t| — oo, it certainly satisfies the condition that integral (2.1) converge 
for a > 0. In this case, F{oj) is an ordinary (often complex) continuous 
function. For example, if 


/ft)./*”* (**°) 

\ 0 (Otherwise) 

where (3 is real and positive, then 

F(u) = (0 + iu)~ l /2ir 

which is complex. If the magnitude of F( u/), 

|F(u/)| = { 0 2 + w 2 ) -1/2 / 2 -k 
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Chapter II Harmonic Analysts 


|f(go)| 



is plotted as shown in figure 5, it can be seen that f(t) is represented 
by harmonic functions having a continuum of frequencies. 

Periodic functions . If f{t) is bounded and periodic with period p, 
it again satisfies the condition associated with integral (2.1). However, 
unlike the situation illustrated in figure 5, is nonzero only at a 
discrete set of frequencies. To see this, note that periodic functions may 
be expanded in the familiar Fourier series, that is, 

oo 

/(«) = Y + E (OnCOScOni + fcnSincJnO 

i n= 1 

where u n = 2mr/p. This series also converges to f(t) at all points 
where f(t) is continuous. Introducing the mathematically convenient 
concept of negative frequencies allows this series to be written as 

/(*)- E F ^ nt ( 2 -4) 

n=— oo 

where 

Fn = an ~ — = - [ P fit) dt ( 2 . 5 ) 

2 P Jo 

and n — — u; n . 

Now, consider the so-called Dirac delta function 6(u> — u/), which is 
defined by the relations, 


and 


<5(<jJ — u/) =0 (cJ 7 ^ u/) 

/ oo 

g{u) 6 (uj — u/) du = g(^) 

-oo 


where g(u/) is any “fairly good” function, 3 that is, differentiable and 
well behaved at infinity. Most functions met in the real world are of 
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this class. The delta function, which is defined by its integral property 
above, is actually a generalized function; that is, it lies outside the class 
of ordinary functions. It may be envisioned as the limit as Au — 0 of 
the rectangular function 


, j zfc (|w-u/|<%*) 

D(uj - J) = < 

{ 0 (Otherwise) 

whose width is Aai, whose height is 1/Aur, and whose integral is unity. 
As Auf 0, the amplitude of this function grows without bound, but its 
integral is unchanged. The delta function has a long and controversial 
history, being first introduced without proper mathematical justifica- 
tion by those who found it exceedingly convenient. Only in recent years 
has it been placed on a rigorous foundation 3 to the satisfaction of the 
mathematicians. 

The delta function arises in the analysis of periodic functions because 
periodic functions are not square integrable; that is, f 2 (t) dt 
is unbounded if /(£) is periodic. As the existence of this integral 
is a requirement for the nongeneralized Fourier transform to exist.- 
harmonic analysis of such functions must rely on the full power of 
generalized harmonic analysis. From the definition of the delta function, 
it can be seen that the function /(£) = e luJ 1 may be expressed as 

f{t) = = f°° 6(u - u/) e Mt du 

J — oo 

If this relation is compared with equation (2.2), it follows that the 
Fourier transform of /(£) = e luJ t is 

F(u;) S(uj — u/) 

Thus, by equation (2.3), the highly useful relation 


6 { 



e -t(u;-u S)t dt 


( 2 . 6 ) 


is derived. This equation is of fundamental importance and will be used 
many times in this monograph. 

As an example, only because of equation (2.6) is it possible to 
develop the Fourier integral transform of a periodic function. If 
equation (2.4) is used in equation (2.3) and equation (2.6) is applied. 
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Chapter II Harmonic Analysis 


|F(CO)| 


-COg -C0 2 -C0 1 0 C0 1 C0 2 COg 

Figure 6. Fourier transform of periodic function. 

then 

FM= f; F n 6(u-uj n ) (2.7) 

n=-oc 

which can be seen to be nonzero only at a discrete set of equally spaced 
frequencies. Thus, the Fourier transform of a periodic function has a 
discrete structure. The magnitude of this relation is shown in figure 6, 
since the F n '$ are generally complex. In this figure, the arrows have 
been used to represent the delta functions. It should be mentioned that, 
even though the delta functions are unbounded, discrete transforms 
are generally plotted as an amplitude spectrum with the height of the 
arrow indicating the coefficient magnitude, |F n |, which is actually the 
contribution to the integral of the spectrum at the frequency u = u n . 
Such amplitude spectra are even functions of u/ since 

F-n = F n * 

23 Convolution Theorems 

One property of the Fourier integral transform that will be used 
repeatedly in this monograph is its ability to transform products. 
Consider the transform of the product of two time functions: 

If the individual time functions f(t) and g(t) have Fourier integral 
transforms F(cj) and G(u/), respectively, then 

f{t) = [°° F{u)e lu,t duj 

J — oc 
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and 

g(t) = f°° G(u) e lwt du 
J —OO 

These relations allow the transform of the product to be written 

h j^J^t)e~^dt 


- r° dte _iwt f°° du'F(J)e iu ' t f°° du"G( u")e M ” t 

J —oo J —oo J -oo 


= f°° du' [°° du" F{u')G(u") -i- /*°° <fc e -»'(w-w'-w")t 
J —oo J —oo 2 tt 7 —oo 

where the last integral may be recognized from equation (2.6) as 
6(u - u' - a/')- Thus, carrying out the integration over either J or 
u" yields the fundamental relation 


= [°° du' F{u')G{u-J) = f°° du" F(u - u")G(u") 
J— OO J — oo 


( 2 . 8 ) 


called the frequency convolution theorem. 4 Often this theorem is 
written 

Mg(t) F(u) * G(u) 

where the asterisk indicates convolution and the double-headed arrow 
indicates a Fourier transform pair. In words, this theorem states that 
the Fourier integral transform of the product of two time functions is 
equal to the convolution integral of the Fourier integral transforms of 
the two functions in the frequency domain. This result will be applied 
extensively throughout this monograph. 

A converse of this theorem, called the time convolution theorem. 4 
may also be developed. The Fourier transform of a convolution integral 
in the time domain 


[ f(-r)g(t-r)dT = f(t)*g(t) 
J— oo 


satisfies the theorem 
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Chapter II Harmonic Analysis 


that its Fourier transform is given by 2 it times the product of the Fourier 
transforms of the individual functions f{t) and g{t). This theorem 
greatly simplifies the application of Fourier transforms in linear, shift- 
invariant systems. 

With the definition of the fundamental Fourier transform pair and 
some understanding of its properties, the tools are available to begin 
consideration of the application of harmonic analysis to time histories 
that may be considered to result from random phenomena. 
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Chapter III 

Random Process Theory 


The theoretical foundation underlying the harmonic analysis of 
random time histories is random process theory. For much of the 
actual practice of such analysis, this foundation is buried so deeply 
that the user may not even be aware of its existence. However, proper 
understanding and application of time series analysis techniques require 
its consideration. 


3*1 The Concept of a Random Process 

The subject of all random analyses is an experiment £, which could 
be performed repeatedly, at least conceptually. For example, one might 
test an aircraft part to failure in fatigue. Each performance of such 
an experiment is called a trial and its result is an outcome <;. If it is 
impossible to predict , independent of the number of times the experiment 
has been performed previously, what the result of a given trial will be, 
the experiment is said to be random. In this case, there are more 
than one and, perhaps, an infinite number of possible outcomes to the 
experiment. The set of all possible outcomes is called the sample space 
S = {fi, • •} of the experiment. In the real world, the sample space 
generally contains an uncountable number of possible outcomes. In 
analyzing the experiment, an attempt is made to statistically describe 
this whole set of possible outcomes. 

Now, consider an operator that yields a function x(t) of the parame- 
ter t for each outcome of the experiment, such as a strain gage measuring 
the strain at some point on the specimen as a function of time in the 
fatigue test mentioned above. This operation is shown schematically in 
figure 7. One time history Xj(t) is related to each outcome of the 
experiment £ . The ensemble of all possible time functions that might 
be obtained in this way, is called a random process. Three of 

the possible time histories included in this ensemble are shown in fig- 
ure 7. The single function obtained on a given trial is called a sample 
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Figure 7. Schematic of a random process. 

function , , or realization , of the random process. For example, the ex- 
periment might be launching the Space Shuttle and the operator might 
be a pressure sensor mounted somewhere on the fuselage. Although 
the gross characteristics of the pressure signal would be predictable de- 
terministically, there will be random variations due to wind loading, 
fuel burn rates, etc. Thus, the pressure time history produced by this 
transducer on a given launch would be a sample function from the ran- 
dom process made up of all possible pressure time histories that might 
be produced in this way, some of which have occurred on past Shuttle 
flights or will occur on future Shuttle flights. 

Description of random processes . The most complete description 
possible for a random process is given in terms of its distribution or 
density functions. Consider the event {X(£o) < xq} that the random 
process X(*) (dependence on the outcome q is understood) takes a value 
less than or equal to some chosen number xq at time to. Imagine that 
the experiment £ is repeated many times. Each time the experiment is 
conducted, any one of the possible outcomes q and, thus, any one of the 
time histories making up the ensemble representing the random process 
X(t) could occur. However, on any given trial, one could examine the 
resulting time history to see whether its value at time to was less than or 
equal to xo- If the experiment is repeated N times, then one could form 
the ratio Ng/N of the number of times Ng that the event occurred to 
the number of times N that the experiment was repeated. This ratio 
will be between zero and one. The probability of the event may then be 
defined as the limit of this ratio as the number of repetitions approaches 
infinity, that is, 

P{X(t 0 ) < x 0 } = Jim ^ 

;V— > oc .V 
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Chapter III Random Process Theory 


assuming the limit to exist. Although some mathematicians are 
not totally comfortable with this intuitive definition, 5 it is the ulti- 
mate relationship between the theory and the real world. 6 In words, 
P{X(to) < xo} is the likelihood of occurrence of a sample function 
whose value is less than or equal to xo at time Jo* 

The value of P{X(Jq) < xq} generally depends on Jo and xq. Thus, 
one may expand the concept by dropping the subscripts and thinking 
of x and t as variables to define the first order distribution function of 
the random process X(J): 

F x (x;t) = P{X(t)<x} 

This function generally depends on both x and t and always satisfies 


0<F x {x]t)<l 


An exactly equivalent, but mathematically more convenient, descrip- 
tion is given by the first order density function of the random process 
*(*): 


fx{x; t) 


dFxjxyt) 

dx 


( 3 . 1 ) 


The density function satisfies 



fx(x] t)dx= 1 


and 

fxfc t)> 0 

since the distribution function is a monotonically nondecreasing func- 
tion of x for each t. 

For many applications, the amount of information contained in 
the distribution or density function is more than is feasible, or even 
desirable, to know about the random process. In these cases, it is 
possible to introduce certain averages, or expected values , over the whole 
ensemble of functions comprising the random process. Suppose that 
</[X(J)] is any function of the values taken by the random process at 
time J, such as sin AT (t) or X 3 (J). Then one may define the expected 
value of g[X(J)] by 


E{g[X(t) \}=[ g{x)f x (x;t)dx 
J — oo 
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Introduction to Time Series Analysis 

where E is called the expectation operator. Thus, the operation of 
taking expectation merely amounts to multiplying the function by the 
density function for the random process at time t and integrating over 
all possible values of x. It is equivalent to averaging the values of the 
function at time t obtained by conducting the experiment repeatedly. 
The most useful of these expected values are 

Mean 

m x {t) = E{X{t)} m [ xfx(x; t) dx (3.2) 

J -CO 

which is the average value taken by the random process at time t. 
Mean square 

E{X 2 (t)}= r x 2 fx(x-,t)dx (3.3) 

y-oo 

which is often called the “power” in the random process X(t). 

Variance 

0x(t) = E{[X(t) - mx{t)} 2 } = f [x - mx(t)] 2 fx{x;t) dx (3.4) 

J — oo 

which is a measure of the variation of the random process about its mean 
at time t. The square root of the variance, ls called the standard 

deviation . Note that the expectation operation, which is an integration, 
is linear. Thus, the expected value of a sum is the sum of the expected 
values. Further, the expected value of a constant or deterministic 
function is equal to that constant or deterministic function. Therefore. 

= E{[X{t) - m x (t)] 2 } = E{X 2 (t) - 2mx(t)X(t) + m 2 Y (t)} 

= E{X 2 (t)} - 2 m x (t)E{X(t)} + m 2 x (t) 

= E{X 2 (t)}-m 2 x (t) 

This expression indicates the fundamental relationship between the 
mean, mean square, and variance and provides a useful alternative way 
of calculating the variance. 

In the same spirit in which the first order distribution function was 
developed, one may define the second order distribution function of the 
random process X(£): 


= P{X{t i) < x { n x{t 2 ) < x 2 } 
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Chapter III Random Process Theory 


Here 0 indicates the set operation of intersection and may be read as 
“and.” Thus, the second order distribution is the probability that the 
random process is less than or equal to x\ at time t\ and less than or 
equal to *2 at time *2* Again, it can be interpreted as the likelihood of 
occurrence of a sample function having these properties. 

Likewise, the second order density function 


fx(.X\,X2\h,t2) 


d^Fxjx i,x 2 ;ti,i 2 ) 
dx\ dx 2 


and an expected value that depends on the values taken by the random 
process at two times 


'{ 3 (x(t 1 ),xu 2 )]}= jf^dx i 


OC 

OO 


dx 2 g(xi,x 2 )/x(xi, xi\t\,t 2 ) 


may be defined. Here g[X(ti), Xfo)] is again any arbitrary function 
such as exp[X(£i) + X(to)j- The most useful second order expected 
values are 


Autocorrelation 

/ OO r OO 

dxi / dX2X\X2jx{X\,X2\ti,tl) 

•OO j — OO 

(3.5) 

which is a measure of the linear relation between the values taken by 
the random process at times £i and t 2 - Note that R\{t, £) = £{X 2 (£)}. 

Covariance 

T X (ti,t 2 ) = £{[X(ti) - m x (£i))(X(£ 2 ) - m. Y (t 2 )]} 

= Rx{t i, f 2 ) - mx(ti)mx(t2) 

Note that r, Y (£, t) = <r Y (t) and that if mx{t) = 0, r\ Y (£i,£2) = 
Rxihih)- 

Correlation coefficient 


Px{t uh) 


vxih^xto) 


which can be shown to satisfy |p, Y (£i,£ 2 )| ^ 1- 
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Introduction to Time Senes Analysis 

The concept of a distribution function may be further generalized 
by defining the nth order probability distribution function 

Fxi x li x 2i t 2} --I ^n) 

= P{X(ti) < Xi n X{t 2 ) < X 2 n * • • n x(t n ) < x n } 

for the random process X(t) for all n and any collection of times 
• ••**«• In words, the nth order distribution function is the 
likelihood of occurrence of a sample function whose value x(t) is less 
than x\ at time and less than X 2 at time h and ■ • * and less than x n 
at time t n . 

An exactly equivalent description is given by the nth order density 
function 


fx ( x li x 2i •••> *n\t l, *2, ^n) 

_ d n F x {x i, x 2 , x n ;i i, *2, in) 

<?Xi dl2 ' 4 * dx n 

and, if g(X(*i), X(*2), •* • , AT(t n )] is any function of the values taken 
by the process at times t 2 , . . . , t n * then the expected value of this 
function is defined by 

E{g[X{t l ),X{t 2 ),...,X(t n )\} 

/ oo roo roo 

dx l dX 2 ■ ■ ■ I dx n g(x 1,12 Xn) 

-OO J — OO ■/ — OO 

x fxix l, 12, •••, t 2 , .... i n ) 


The description of a random process by its nth order distribution and 
density functions provides the maximum possible information about 
the process and will be required by some of the analysis later in this 
monograph. 


Normal random process. A random process JV(t) is said to be normal 
if all its density functions are of the form called Gaussian. In particular, 
the first is 


/x(*;0 



(x — m) 2 
2a 1 


( 3 . 6 ) 


where m = and a 2 — cr 2 x (t), and the second is 
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Chapter III Random Process Theory 


/.y( 2 1» x 2;*1>*2) = 


2n-<Tl<72\/l — ~p * 


exp < - 


2(1 -P 2 ) 


(il - mi ) 2 


2p(xi - m t )(x 2 ~ mg] (x 2 - m 2 ) : 

<? 1<72 


1 

1 ) 


(3.7) 


where mi = mx(*i)i m 2 = mx(*2)i = ^x(^l)t <?\ = ^x(^)t and 

P = Px(^l»^2)* Note that if p = 0, 




(3.8) 


Similar closed form expressions exist for the density functions of higher 
orders. Normal random processes are useful because they seem to 
describe many phenomena that occur in the real world. Further, many 
of their mathematical properties are quite simple. For example, any 
linear operation operating on a normal random process yields another 
normal random process. 

Calculus of random processes . A calculus of random processes, 
called mean square calculus, has been developed based on a concept 
called mean square convergence. This calculus is a fascinating study 
in itself and the interested reader is referred to the text by Papoulis. 
However, for most purposes, it is sufficient to know that all the ordinary 
operations of calculus, such as differentiation and integration, may be 
applied to random processes with certain mild restrictions. Only the 
concept of a limit must be interpreted differently. For example, a 
random process X(t) is differentiable at t = to if its autocorrelation 
has a second partial derivative where ti — t<i = to- Likewise, a 
random process X(t) is integrable over the interval / = ( 1 1 , ^2 ) if its 
autocorrelation is integrable over the area I x I. 

Consider a random process Y(t) given by the integral of a random 
process X(f) with respect to some kernel function K(t;r), that is, 


rm 

Y(t) = / 

Ja(t) 


K(t ; r)X(r) dr 


where the limits a and b are arbitrary, but deterministic. Then, the 
expected value of F(t) is given by 


E{Y(t)}= f m K(f,r)E{X(r)} dr 

Ja(t) 
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Introduction to Time Series Analysis 

since the expectation operator, being linear, may be interchanged with 
integration. This fundamental result will be used repeatedly throughout 
the monograph. 

3.2 Random Variables 

Anyone who has read this far in this monograph is undoubtably 
familiar with the concept of a random variable, which varies over a 
set of numbers which are each assigned to one outcome of a random 
experiment. For example, the number of dots on the upturned face 
when a die is rolled is a random variable. There is also an intimate 
connection between random processes and random variables. Recall 
that a random process was defined as an ensemble of time functions, 
one of which was assigned to each outcome of the random experiment. 
If the time parameter in a random process X(t) is considered fixed, at 
t = Jo say i then X(<o) is just a number associated with each outcome 
and is therefore a random variable. Thus, random variables may be 
described in the same manner as random processes, having distribution 
and density functions and expected values. 

The normal random variable . A normal random variable X is 
described by the density function given by equation (3.6) with 

E{X} = m E{{X -m) 2 } =<r 2 

Two normal random variables X{ and X 2 defined on the same exper- 
iment £ would have the joint density function given by equation (3.7) 
with 

£{*!} = rrn E{{X x -m x ) 2 }=<rl 

E{X 2 } = m 2 E{(X 2 - m 2 ) 2 } = a\ 

and 

E{(Xi - mi)(X 2 - m»)} 

P = = — 

CTi<7 2 

If p = 0, then the joint density factors as shown in equation (3.8), and 
the random variables Xi and X 2 are said to be independent. 


The chi-square random variable . An important random variable in 
understanding the variability of spectral estimation techniques is the 

chi-square random variable. Suppose that Y x for i = I, 2 k are 

independent, normally distributed random variables with zero means 
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Chapter III Random Process Theory 


and unit variances, that is m t = 0 and = 1. Then the random 
variable ^ 

5 = Yf + Y% + • • • + H *i 2 (3-9) 

1=1 

is called a chi-square random variable with k degrees of freedom. 
In other words, the number of degrees of freedom is the number of 
independent random variables whose squares are added. The mean 
value of 5 is 

k 

e{S} = Y,e{y?} = i< 

i=l 

and its mean square value is 

k k Jc 

£{S 2 } = 22 E{y; 4 } + 22 £ E{Y 2 }E{Y 2 } 

i=i 1=1 ;=1 

«W 

= 3it + jfc 2 - it = 2k + k 2 


since for a normal random variable, 

E{Y*} = 3E 2 {Y t 2 } 

Thus, the variance of the chi-square random variable is 
£r| = E{S 2 }-£ 2 {S} = 2fc 
Note that this random variable has the property that 


^! p \/!- 0 “ *-°° 

which shows that the variability of this random variable relative to its 
mean becomes less important as the number of degrees of freedom is 
increased. The probability density function of the chi-square random 
variable is given by 


fsivk) 


1 jfc/2— 1 

2 */ 2 r(Jb/ 2 ) 


e-»/ 2 


(«> 0 ) 


(3.10) 


where T is the gamma function. From this density, plots of the variation 
of the chi-square random variable about its mean at various numbers 
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Introduction to Time Series Analysis 

of degrees of freedom may be produced. Such a graph is included in 
chapter VIII. 

33 Jointly Distributed Random Processes 

The concepts used in describing a random process may be extended 
to two random processes defined on the same experiment. For example, 
consider the experiment of measuring noise transmission through the 
fuselage of an aircraft. One random process might be the acoustic 
pressure measured by a microphone placed outside the fuselage near 
the engine while the other might be the acoustic pressure measured by 
a microphone inside the fuselage near the pilot’s seat. 


Description. Consider the two random processes X(t) and K(£)* 
The joint distribution function of the two random processes is defined 
as 


F X y(x 1,*2i X n , 2/1, 1/2, • Vmlti, *2, *n+m) 

= P {X(tx) < x x n X(t 2 ) < x 2 n • • • n X{t n ) < x n 
r\Y(t n + x ) < yi n Y{tn+ 2 ) < y 2 n • • • n Y(t n + m ) < y m } 


where there is no importance to the order of the times t\, t 2) . . . , t n +m* 
The joint density function is then given by 

/xy(* li X2, Ini 2/1, 2/2, •••, Vm\ ti , t?, . .., t n +m) 

_ a n " hm Fxy(xi, X2 Xn, VI, 2/2, • ••, ym\ tj, t?, .... t n +m) 

dx\ dxi . . dx n dyi dy? ■ - • dym 

Joint expected values may also be defined in a similar manner. The 
most useful one is 

Cross correlation 


RxY(h,t 2 ) = E{X(t x )Y(t 2 )} (3.11) 

Note that, by convention, the first time goes with the first subscript 
variable. Thus, 

RYx(h,t2) = £{m)X(t 2 )} 

and, in general, R.xy^ l. <2) / 
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Chapter III Random Process Theory 

Independence . Two random processes X(t) and Y (t) are said to be 
independent if 

Ixy( x 1* x 2> x n, yi» 2/2> ym; *1, t2» *n+m) 

= /.v( x li x 2i x n; ^ 1 1 *2> ***i ^n) 

x /y(yi» y2» *•** ymi ^n+l» ^n+2> Wm) 

that is, if their joint density function factors into a product of their 
individual density functions. 

Uncorrelated random processes . Two random processes X(£) and 
Y(t) are said to be uncorrelated if their cross correlation satisfies 

RxY(h,t 2 ) = E{X(^)}£{y(^)} = m x {h)m Y {t2) 

Independent random processes are uncorrelated, since 
E{X{ti)Y{t 2 )} = [ dx f dyxyf XY {x, 

J — OO J —oc 

/ OO TOO 

dxxf x (x;t i) / dyyf Y {y;t 2 ) 

-OO » “ ’OO 

for independent processes. However, the converse is not necessarily 
true. 

Complex random processes . Random processes that take complex 
values also arise from time to time. These are easily handled by writing 
the complex- valued random process Z(t) as 

Z{t) = X{t) + i Y(t) 

and considering the real and imaginary parts, X(<) and y(£), respec- 
tively, to be two real-valued random processes defined on the same 
experiment. For complex-valued random processes, the autocorrelation 
is defined by 

Rz{ti,t 2 )-E{Ziti)Z m {t 2 )} (3.12) 

where the asterisk indicates the complex conjugate. This makes 

R z (t,t) = E{\Z(t)\ 2 } 

which is real and non-negative. 
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Introduction to Time Senes Analysts 

3.4 Stationary Random Processes 

A useful subdivision of the class of random processes is based on 
behavior in time. Some random processes, such as the velocity compo- 
nent measured by a hot-wire anemometer at a point in a turbulent jet 
running at constant speed, are reasonably independent of the precise 
value of the time. That is, even though the velocity fluctuates quite 
rapidly, measurements made at different times are quite similar in their 
average properties. Other random processes have average properties 
that vary appreciably with time; for example, the load demand on an 
electric power generating system depends on whether it is night or day 
or winter or summer. 

Random processes whose statistical properties do not vary with the 
particular value of time are much more amenable to analysis than 
those whose statistical properties do. Thus, many more powerful 
techniques have been developed for extraction of information from 
them. Such processes are said to be stationary and, in fact, most of 
the techniques developed in this monograph will be limited to stationary 
random processes . 

A random process is said to be strictly stationary if and only if its 
nth order density function is independent of the origin of time for all 
n. From this requirement, it can be shown that the first order density 
is independent of time, that is, 

/x(x;0 = fx(x) 

Thus, all expected values calculated from this density must also be 
independent of time, for example, 


m x {t) = m x 

<7x(t) = a x 


That is, the mean and variance of stationary random processes must 
be constants. For the second order density, independence of the origin 
of time requires that 

fx( x u x 2\hih) = /.y( x 1i*2;0, «2 - *1) * fxi x i»x 2 ;r) 

where r = ” * 1 * That is, the second order density depends only 

on the difference of the two times t\ and £2- Thus, all expected 
values calculated from this density must display the same property. 
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Chapter III Random Process Theory 


In particular, 

= R.xi T ) 

I\y(* 1.*2) = r x (r) 

P.v(*l>*2) = Pxi T ) 

Thus, the autocorrelation, covariance, and correlation coefficient of 
stationary random processes depend only on the time difference r. 
It is this property that aids the analysis of such processes, since the 
autocorrelation depends on only one independent variable rather than 
two. 


Weak stationarity . Although the mathematical definition of sta- 
tionarity depends on the density functions, for most of the analysis in 
this monograph to be valid, only a weaker form of stationarity, based 
on the expected values, is required. A random process X(£) is said to 
be weakly stationary if 

E{X{t)) = m x E{X(t i)X(t a )} = Rxir) (3.13) 

That is, its mean is constant and its autocorrelation depends only on 
the time difference. 

Joint weak stationarity. The concept of weak stationarity may be 
extended to two random processes defined on the same experiment. 
Two random processes X(£) and Y(t) are said to be jointly weakly 
stationary if they are each weakly stationary, and 

EiXWYiti)} = R xy (t) (3.14) 


Properties of the autocorrelation function of a weakly stationary 
random process . Suppose that X(t) is a weakly stationary random 
process with mean zero. (Since the mean value of a weakly stationary 
random process is constant, if one had a weakly stationary random 
process y(£) with mean my, one could merely substract the mean from 
the data and define 

X{t) = Y{t) - my 

which is a weakly stationary random process with mean zero.) Then, 


E{X(<i)X(t 2 )} = R x (t 2 - 10 = fl x (r) 
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Introduction to Time Series Analysis 


where r is called the lag time , since it is the time by which the second 
value of the random process X(i) lags the first. Further, since order is 
immaterial in the expectation, 

E{X(t 2 )X(h)} = Rxih - 1 2 ) = Rx(-r) 


Thus, it follows that 

Rx( r ) = Rx(~ r ) 

That is, the autocorrelation must be an even function of r. Further, 

/2 X (0) = £{X 2 (*)} >0 

which says that the value of the autocorrelation at a lag of zero must 
be non-negative. Also, 

E{[X(t!) ± X(t 2 )} 2 } = 2(* x (0) ± * x (r)] > 0 


which implies that 

#x(0) ^ l^x( T )l 

Thus, the absolute value of the autocorrelation at any lag can never be 
larger than its value at zero lag. Finally, recall that autocorrelation is 
a measure of the linear relationship between X{t) and Af(t + r). If X(t) 
is a completely random process (i.e., its mean is zero and it contains 
no periodic signals), this linear relation weakens as r increases and, in 
fact, 

lim R x {t) = 0 

|r|— oo 

With this understanding of the properties of the autocorrelation func- 
tion, it is possible to sketch the autocorrelation function for a typical, 
completely random process as shown in figure 8. Further if the deriva- 
tives exist, it can be seen that 


dRxir) 

dr 


= 0 

r= 0 


since the autocorrelation is even, and 


fRxjj) 

since the origin must be a maximum. The properties of the autocorrela- 
tion function will be important in understanding its Fourier transform, 
the power spectral density, which will be introduced in the next chapter. 
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Chapter III Random Process Theory 



Figure 8. Autocorrelation of a typical, completely random process. 


29 



* v 

L L 


UUUlllililiLlilililiUU 





.1 


PRECEDING paqe blank not filmed 


Chapter IV 

Power Spectral Analysis 

The technique of power spectral analysis of stationary random 
processes was developed about 50 years ago. Power spectral analysis 
was first utilized by the electrical engineering community, particularly 
in the field of communications 7,8 ; thus, much of the terminology of 
the technique comes from electrical engineering. As will be seen, this 
terminology sometimes creates confusion. In recent years, advances 
in digital computers and hard-wired spectral analyzers have allowed 
applications of power spectral analysis to grow exponentially. 

There is a fundamental difference between power spectral analysis 
and ordinary harmonic analysis. Instead of developing a harmonic 
representation of the sample functions of the random process itself, 
in power spectral analysis, one develops a harmonic representation of 
the autocorrelation of the random process. The autocorrelation is a 
well-behaved deterministic function, and from this representation, one 
can infer average properties of the random process. 

The power spectral density can be defined as the ordinary Fourier 
integral transform (eq. (2.3)) of the autocorrelation function of a 
stationary random process, that is, 

s xM = “ /_~ Rx(r) e- twr dr (4.1) 

By the inversion relation (eq. (2.2)), the autocorrelation can be recov- 
ered as 

Rx(r)= f°° Sxi^e^du (4.2) 

This Fourier transform pair is often called the Wiener-Khinchin rela- 
tions because Wiener and Khinchin derived the pair from a harmonic 
representation of the random process instead of stating the pair as a 
definition as done here. 
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Introduction to Time Series Analysis 

4.1 Properties of Power Spectral Densities 

Certain properties of power spectral densities are readily apparent 
from the Wiener-Khinchin relations. For example, from equation (4.2). 


R X ( 0) = E{X 2 {t)} = f°° S X (w) du (4.3) 

J -oc 


It is this relation that gave power spectral density its name. Recall 
that if V and I are the voltage across and current through an electrical 
element of resistance R, then the power consumed by the element is 
I 2 R = V 2 /R. In electrical engineering, X(£) is frequently a voltage or 
current time history, and thus electrical engineers tend to think of X 2 (f) 
as power. For this reason, they called the mean square value, E{X 2 {t ) }, 
the “power” in the random process X{t). This is unfortunate, since 
the mean square value may have nothing to do with power at all and 
may be confusing in other fields, such as acoustics, where power has a 
different definition. However, it is too late at this point to change the 
terminology. 

Returning to equation (4.3), it can be seen that the “power” in 
the random process AT(t) may be obtained by integrating the power 
spectral density over all frequencies. Thus, the “power spectral density'’ 
is the density of power with respect to frequency, or the power per unit 
frequency, in the signal. One favorite examination question in this 
field is, what are the units of power spectral density? From the above 
discussion, it should be clear that the answer is 


Units of Sjy(^) = 


[Units of JT(f)] 2 
Units of frequency 


Thus, for example, if X(£) is an acceleration measured in ft /sec 2 
and frequency is measured in radians per second, the units of 5 y(^) 
are ft 2 /sec 3 . On the other hand, if X(£) is elevation measured in 
feet as a function of distance measured in feet, then frequency is 
measured in radians per foot and the power spectral density has units 
of ft 3 . Furthermore, if t is a spatial variable, then X(t) is said to 
be homogeneous in space (as opposed to stationary in time) and the 
frequency variable u is called wave number , or spatial frequency, and 
frequently denoted by k or p. 

Now, consider equation (4.1). If the exponential is expanded, 


1 f°° i 

Sx(u) = — / Rx{t)cosut dr - — / ffx( r ) sinu;r dr 
2tt 7—oo 27T 7—oo 
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Chapter IV Power Spectral Analysis 


However, the autocorrelation /?v( r ) is an even function of r. Thus, the 
second integral, being the integral of an odd function over even limits, 
is zero. Therefore 


1 f°° 

SxM = — J_^ #x(r)cosu/rdr 


is real. 

Further, coscjt is an even function of r and thus SxM may be 
written 

SxH = - 

From equation (4.4), it can also be seen that 

SxM = 5x(-w) 

That is, the power spectral density is an even function of cj. 

4.2 Problems in Comparing Power Spectral Densities 

The freedom inherent in the definition of a Fourier transform pair, 
as mentioned in chapter II, and the fundamental properties of a power 
spectral density result in there being no standard definition for power 
spectral density. This latitude often leads to problems in comparing 
power spectral densities obtained by different groups utilizing different 
definitions. 

The first ambiguity arises because theoreticians prefer to work in 
terms of radian frequency a;, defined for both positive and negative 
frequencies. However, engineers prefer to use cyclic frequency /, where 
u; = 27 t/, defined only for positive frequencies. The units of / are 
cycles per second (hertz). Note that, since the power spectral density 
is an even function of u ;, the mean square value of the process may be 
obtained from any of the expressions 

E{X 2 {t)} = r S x(“) du = 2 [°° SxM du = 4w f°° Sx(2*7) df 
J — o© -Jo JO 

Thus, engineers prefer to define the one-sided power spectral density: 
G X U) = 47rSx(27r/) (/ > 0) 

from which the power in the process may be obtained by 

E{X 2 (t)}= [ °°G x (f)df 

Jo 


r oo 

Jo 


Rxi r ) cose or dr 


(4.4) 
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Introduction to Time Senes Analysis 

All the modern spectral analyzers compute the function Gy(/). 

Second, older analog spectral analyzers use fixed, finite bandwidth 
filters. These analyzers do not yield a power spectral density at all, but 
an integrated power spectrum , that is, 

SP(n = f h Gx(f) df (fi <r< f 2 ) 

J f 1 

which amounts to integrating the power spectral density over some finite 
bandwidth {fi, f 2 ) > such as a third octave. An integrating analyzer thus 
assigns the total power in a bandwidth to some frequency /* within 
the band, chosen by a committee of workers in the field. To compound 
matters, many standards for various phenomena are still written in 
terms of this kind of data. 

Finally, in any Fourier transform pair, such as equations (4.1) and 
(4.2), the placement of the factor (27T)* 1 is completely arbitrary. What 
such a pair says is that if one starts with an autocorrelation, transforms 
it into the frequency domain, and then transforms back to the time 
domain, the autocorrelation is reproduced. Thus, rather than the pair 
given, no coefficient could be placed in front of the time integral and 
(2 tt)" 1 in front of the frequency integral, or (2 ir)~ x / 2 in front of both, 
or some other combination. Or, if it were preferable to work in terms 
of cycles per second, one could use the transform pair 

Sx(f) - f° Rx(T)'- i2 * fr dT 

J — 00 

Rx(r) - f°° Sx{f)^ vfr df 

J—o o 

with no coefficient in front of either integral. Which of these definitions 
is chosen has absolutely no effect on the reproducibility of the original 
function, as long as the definitions are used consistently . However, if 
ohe stops halfway through, at the power spectral density, the amplitude 
depends on which definition is being employed. The only solution to 
this problem when trying to compare power spectral densities is to refer 
to the documentation for the software or to the equipment operator's 
manual for the hardware and determine what transform pair is being 
employed. Then, conversion factors which will allow comparison can be 
developed. 

43 Interpretation of Power Spectral Densities 

A mathematical model that describes most stationary random 
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Chapter IV Power Spectral Analysis 


signals observed in real world situations is 

k 

Y(t) = Ao+^T A n cos(u ; n t + <p n ) + X(t) (4.5) 

n=l 


where the A n 's are constants, the u n 'sare a set of frequencies which 
are not necessarily harmonically related, k may be infinite, the 0 n ’s are 
independent random phase angles taken to be equally likely to have 
any value between zero and 27 t (i.e., no knowledge of the phase of the 
periodic functions is assumed), and X(£) is an independent completely 
random process. That is, most stationary signals can be modeled by a 
constant, a collection of periodic functions, and a completely random 
signal. For simplicity, Aq may also be interpreted as the amplitude of 
a periodic function of zero frequency. 

The mean value of the signal Y (£) is 

E{Y(t)} = Ao 

and its autocorrelation can be shown to be 



(4.6) 


where R\(r) is the autocorrelation of the completely random signal 

X(t). 

Note that if the time history contains periodic functions, the 
autocorrelation contains periodic functions of the same frequency. This 
might be surprising, as correlation is a squaring operation which ought 
to double the frequency (i.e., 2cos 2 u; = (1 +cos2w)). However, this 
frequency doubling does not occur. Further, the presence of periodic 
functions in the signal can readily be detected. Since lim /?v( r ) — 0, 


T l ilSo R y( t ) = 4) + H 4r cosw r»r 


n= 1 


Thus, if the autocorrelation does not approach zero for large r, the 
presence of periodic functions is to be expected. 

The power spectral density of the model signal given by equa- 
tion (4.5) is obtained by Fourier transforming equation (4.6). From 
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Introduction to Time Series Analysis 
equation (2.6), it can be shown that 

J cosw„re~ ,wr cfr = i[£(u/ - jj n ) 4- 6(uj + ^ n )] (4.7) 


Thus, 



(4.8) 


where Sx(u/) is the power spectral density of the completely random 
process X(t). 

Now since lim Rx( r ) = 0, Sx^u/) is the Fourier transform of a 

r— oo 

transient function and represents a continuum of frequency components 
as shown in figure 3. For example, if 


Rxi T ) - o 2 x e Q|r| 


then 


SxM = 


o\ a 
—A 

7T a 2 4* a; 2 


(4.9) 


where a is real and positive. Note that 5^(0) # 0 even though 
E{X(£)} = 0. That is, even if there is no dc component in the signal, the 
power spectral density is nonzero at zero frequency. Thus, in general, 
the power spectral density of the model stationary signal given by 
equation (4.5) appears as shown in figure 9, where again the arrows 
represent the delta functions. This general spectrum consists of a delta 
function at zero frequency produced by the mean of the process, delta 
functions at the frequencies produced by the periodic components, 
and a continuous distribution of power produced by the completely 
random part. 


4.4 Relation Between the Power Spectral Density and the 
Fourier Transform of a Random Process 

The relationship between the power spectral density, Sx{*>)i which 
is a Fourier transform of the autocorrelation function of X{t), and the 
Fourier transform, X(u), of the random process itself is of interest and 
importance. Here, the full power of generalized harmonic analysis is 
once again called into play since a stationary random process, not being 
square integrable, can only be Fourier transformed in the generalized 
sense. 
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Chapter IV Power Spectral Analysis 


Sy(CO) 



A stationary random process X(t) satisfies equation (2.1). Thus, 
X(t) has a generalized Fourier transform given by 


x(u) "^IZ> x{t)e ' iutdt 


(4.10) 


Note that X(ut) is a complex-valued random process in frequency. The 
autocorrelation of X(u)) is given by (see eq. (3.12)) 

E{X( u,)*V)} = A /°° r 

4tt J — oq J — oo 

Introducing the variables t and r such that 


: h + h 

t = — - — r = t 2 -t i 


yields 


£{X(«)XV)} = £ j^drR x {r)e ^^ T / 2 

xif «£ e -*(w-w')t 

27T 7—00 

= S* ( ~y~ ) <5 ( w-w> ) 

That is, the autocorrelation of X(ui) is zero except when w = uA Thus, 
it follows that 

Sx(w) = f°° E{X(oj)X'(u’)}du' (4.11) 

J — oo 
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Introduction to Time Series Analysis 

That is, the power spectral density may also be interpreted as the 
integral of the autocorrelation of the Fourier transform of the random 
process over all frequencies. Equation (4.11) is closely related to the 
expression for the power spectral density utilized in the finite Fourier 
transform approach to be discussed in chapter VII. 

4.5 Cross Spectral Density 

If X{t) and Y{t) are jointly stationary random processes, it is 
possible to define the cross spectral density as the Fourier transform 
of the cross correlation: 

Sxy(u) = 2 ^ J RxYi T ) e luJT d T (4.12) 

The cross correlation is regained by 

Rxy( t ) — f SxyM (4.13) 

J — oo 

However, the cross correlation is not in general an even function of r. 
Thus, if the exponential is expanded to obtain 

1 i 

SxyM = r- / RxYi T ) cos uJTdT - — / Rxy( t ) sin dr 
lit J —OO j —oo 

the second integral does not necessarily vanish, and the cross spectral 
density is, in general, complex, having a real part 

1 f°° 

R*[Sxy (<*;)] = 2n J flxy^cosu/rdr 

which is an even function of u; and is called the co-spectrum . and an 
imaginary part 

Im [S xyM] = / Rxy( t ) sinu/r dr 

27T J - oo 

which is an odd function of u ; and is called the quad- spectrum. For 
plotting purposes, the cross spectral density is usually represented in 
polar form, that is, 

Sxy( V) = |S.YvMI«" xy(J) 
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Chapter IV Power Spectral Analysis 


and its magnitude 

IS.vyMI = {Re 2 [SxY(^)\ + Im 2 [S Y y(u /)]} 1 / 2 


and phase 


Oxy{ u ) = arc tan 
are plotted as a function of frequency. 


ImtS.yyMl 

Re[SjYy(w)] 
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Chapter V 

Random Processes in Linear Systems 


Consider a linear, shift-invariant system with a random process 
input X(t) as shown in figure 10. The linear system might be a bridge 
girder subjected to the random loading of automobiles of various weights 
and speeds arriving at various times, or it might be a Space Shuttle tile 
subjected to random heat loading during reentry. Any system that can 
be described by linear equations is linear. Shift invariance implies that 
the parameters of the system are independent of time. 

5.1 Description of the System 

The system in figure 10 is uniquely defined by its impulse response 
function h(t), which is the response of the system to an impulsive load 
<5(£) at t = 0. If the parameter t is actually time, then for all real 
systems, h(t) = 0 for t < 0, since there can be no response until the 
load is applied- A system for which this is true is said to be causal . 
For all real systems, the response also tends to die away with time 
because of damping and becomes effectively zero for t > T r where T r is 
called the response time. Thus, h(t) is a transient function, as shown 
in figure 11. 

In terms of the impulse response function, the output of the system 
Y(t) is given by the convolution integral 



h{a)X(t-a)da 


(5.1) 


This fundamental equation, which describes any linear, shift- 
invariant system, is developed in textbooks on linear system theory, 9 a 
knowledge of which is assumed in this monograph. If the input X(t) 
and the impulse response function h(t) each have Fourier transforms 
by the definition in equation (2.3), then from the time convolution the- 
orem (eq. (2-9)), the Fourier transform of the output Y(t) is given by 
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XCt) 



Linear 


System 


* Y(t) 


Figure 10. Random process input to a linear, shift- invariant system. 


h(t) 



t 


Figure 11. Impulse response function. 

the product of the Fourier transforms of the input and impulse response 
functions with a proportionality factor of 2 n. In order to remove this 2 tt 
factor and thus simplify the resulting expressions, the following defini- 
tion of the frequency response function H{u) for the system is employed 
in this monograph: 


H{u) = r h(t) e~ lut dt (5.2) 

J -OO 

The frequency response function also provides a unique description of 
the linear, shift-invariant system since the impulse response function 
may be recovered by the integral 

Mt) = — I" ff(w) c iwt du (5.3) 

^ J -OO 

Further, with the use of the definition in equation (5.2), the relation 
between the Fourier integral transforms X(u/) and Y{w) of the input 
X(£) and output Y(t), respectively, is the familiar expression 

Y{u) rn H(u)X(u) (5.4) 

by the time convolution theorem (eq. (2.9)). 

5.2 Properties of the Output Random Process 

Consider the linear system shown in figure 10 and suppose that the 
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Chapter V Random Processes in Linear Systems 

input X(0 is a weakly stationary random process in time which began 
at t =s — oo (or in the real world, at t < — T r so that any starting 
transients have died away). Then, the output Y(t) is also a random 
process, which is given by the convolution integral in equation (5.1). 
The question of interest is then, what are the characteristics of the 
output random process Y(£)? 

The mean value of Y (£) is given by 


E{Y(t)} = J_ 


OO 

-OO 


h{a)E{X{t-a)}da 


= rn x 



h(a) da = mjv# (0) 


where expectation has been interchanged with integration and H( 0) is 
the frequency response function of the system at zero frequency, or the 
“dc gain” of the system. Thus, the mean value of Y(t) is constant. 
Further, its autocorrelation is 


*v(*i.*2)-£{Y(*i)Y(*a)} 



da 2 /i(ai)/i(a?)£ {X(ti — a\ )X(t? — a?)} 

da? h(aj )/i(a?)i?x( r “ <*2 + <*i) 


= Ry(t) 


(5.5) 


which depends only on the time difference r. The output Y(t) is 
therefore also a weakly stationary random process, since its mean is 
constant and its autocorrelation depends only on the time difference. 

Likewise, the cross correlation of X(t) and Y(t) depends only on the 
time difference: 

RxY(h,t2) = E{X{ti)Y(t 2 )} = r h(a)R x (r-a)da = R XY {r) 

J -OO 

(5.6) 

Thus, X(t) and Y'(i) are jointly weakly stationary. 

Since Y(i) is stationary, its power spectral density is defined. By 
using equation (5.5) and interchanging the order of integration, it can 
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Introduction to Time Series Analysis 
be seen that 


SyM 


r 

**J- c 

roo /*oo /*o 

■sL‘ n ~L*'L 


R Y ( T )* dr 
■oo 


day h{a\ )h{&2)Rx ( T ~ Q 2 + Q 1 ) 

-OO j — oo 

/ oo /*oo 

^( a l ) e lwai / (iaj A(a^) c“ twa2 
•oo ^ — oo 

rOO 

rfrfl x (r-a 3 + Qi) e - , “( r - a '* +0 >) 

* — oo 

= ff*M*(w)S x (w) 


Thus, 

Sy(w) = |//(u;)| 2 S X (u,) 


(5.7) 


which says that the power spectral density of the output signal is just 
the square of the magnitude of the frequency response function times 
the input power spectral density. This simple relation ,, which is valid 
for any linear , shift- invariant system , is the fundamental reason for 
development of power spectra/ analysis in terms of sines and cosines . 
Although other complete sets of orthogonal functions could be used, 
equivalent relations corresponding to equation (5.7) would not exhibit 
such simplicity. It might be mentioned that equation (5.7) can also be 
derived from equation (4.11) with the use of equation (5.4). 

Since X(t) and Y(t) are jointly weakly stationary, the cross spectral 
density is also defined. Using equation (5.6) and the same approach 
yields 


Sxy M = ± Rxy(t) e-™ dr = H{u)Sx M (5.8) 

That is, the cross spectral density is just the frequency response function 
times the input power spectral density. Note that equations (5.7) and 
(5.8) imply that 

Sy(u/) = if*(u;)SxyM (5.9) 

These simple relations are very useful in understanding the response of 
linear systems to random inputs. 

For example, suppose that the system is an ideal band-pass filter 
with frequency response function as shown in figure 12. Thus, 

_ / 1 (wi < M < UJ2) 

n(u, ~\0 (Otherwise) 
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Chapter V Random Processes in Linear Systems 


H(co) 



Figure 12. Ideal band-pass filter. 


and all frequency components in the range (lji, 0 J 2 ) sure passed without 
amplification while all others are excluded. For this system, the 
relation between the input and output spectral densities is given by 
equation (5.7). Thus, 

E{Y 2 {t)} = f°° Sy (w) dui = f°° |tf(u/)| 2 S X (u/)dw 

J -OO J -OO 

/ u/2 

S x {u)du > 0 (5.10) 

and the total power in the output process can be seen to be just the 
power in the input process in the frequency band (cjx, ^ 2 )- Since 
equation (5.10) must be valid regardless of the values of u/j and W 2 , 
another property of the power spectral density i§ apparent. 

SxM > 0 (5.11) 

If this were not true, the output power could be negative in some band. 

It should be mentioned that the technique for spectral estimation 
employed in the old analog spectral analyzers, which passed the signal 
through a bank of fixed bandwidth filters, was based on equation (5.10). 
Although an ideal band- pass filter for a time signal is not physically 
realizable since its impulse response function 

h(t ) = — cos — tsm — - — t 

irt 2 2 

is not causal, very good approximations to an ideal band-pass filter can 
be constructed. 

53 Determination of Frequency Response Functions 

Equations (5.7) and (5.8) can be used to determine the frequency 
response function of a linear system in a simple manner. A frequency 
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S x (cj) 


1 co 

Figure 13. Power spectral density of white noise, 
response function is generally complex, that is, 

H(u) = 

producing both an amplitude and a phase change in the incoming 
signal. To determine this function, a random process can be input 
to the system and the input and output spectral densities measured. 
Then, if only the magnitude of the frequency response function is of 
interest, equation (5.7) yields 


l#MI = 


' 5y(u/) ' 

.S X M. 


1/2 


(5.12) 


as long as Sx(<*>) # 0. To avoid this and to simplify the calculation, 
the concept of white noise , whose power spectral density is a constant 
as shown in figure 13, is often employed in theoretical analysis. Unfor- 
tunately, such noise is not physically realizable since its total power. 


E{X 2 (t)} = 



So du 


is infinite. However, band-limited white noise such that 



(M < u /g) 

(Otherwise) 


can be generated. One is usually interested in the frequency response 
function only in some range of frequencies. Thus, if uq is larger than 
the values of in which one is interested, this band-limited white noise 
can conveniently be used experimentally to determine the frequency 
response function, since the denominator in equation (5.12) will never 
be zero in the frequency range of interest. * 

If knowledge of the phase of the frequency response function is 
also of importance, then the cross spectral density of the input and 
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Linear 
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System 
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N(t) 


•Y(t) 


Figure 14. A linear system with noise. 


output processes can be measured and the frequency response function 
determined from equation (5.8), that is, 


H(u) = 


Sxy M 


(5.13) 


These techniques, which can usually determine a frequency response 
function from a small amount of data, have all but replaced the old 
time-consuming w sine sweep” method in which the response to each 
individual frequency input was measured. 


5.4 The Coherence Function 

In recent years, the coherence function 


7 2 M 


Sx(w)Sy(w) 


(5.14) 


has come into use. This function has very interesting and useful 
properties. 

Suppose first that Y(t) is obtained by passing X(£) through some 
linear system. Then from equations (5.7) and (5.8), 


2 

1 [ ’ S x (u)\H(u)\lS x (u) 


= 1 


That is, if X(t) and Y(t) are related by equation (5.1), the coherence 
function is unity. 

However, suppose now that the output K(t) contains an additive 
part which is not linearly related to X(t), as shown in figure 14. The 
“noise” signal N(t) could be due to nonlinearities in the system and 
thus depend nonlinearly on the input signal X(t). Or it could be due 
to other inputs, or contaminating signals. In either case, the noise is 
assumed to be weakly stationary and uncorrelated with the input X(f)* 
The output Y(t) is then also weakly stationary and can be written 
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Introduction to Time Senes Analysis 



h(a)X(t - a) da 4- N(t) 


Assuming that the mean of T(t), if any, is removed is equivalent to 
assuming that E{N(t)} = 0 and Rxn( t ) = 0. Thus, 


Sy M = I# M| 2 S;c(w) + S.vM (5.15) 


where Spf{ui) is the power spectral density of the noise signal. Since.Y(t) 
and yV(£) are uncorrelated, from equation (5.6), the cross spectral 
density 

%(«) = H(u)S x {u) 

is unchanged. Thus, the coherence is 


7 2 M = 


|ff(u;)| 2 S£(u>) 


+ SxMSpfiu) 

1 


i + s *( w ) 


< l 


(5.16) 


That is, at those frequencies where the noise spectral density Sy{u) is 
nonzero, the coherence function is less than unity. Thus, a coherence 
value less than one indicates that the input and output processes are 
not totally linearly related. 

A further interpretation of the coherence function may be obtained 
by writing equation (5.15) as 


Sy M = Sy£,M + Sy{uj) 


where 

SylM = l#MI 2 S;*-(w) 

is that portion of the power spectral density of Y[t) which is linearly 
related to the input X(t). Then equation (5.16) becomes 

2/ \ _ SyiM 

5y(w) Sy(w) 

and the linearly related portion of Y(t) may be determined by the 
relation 

Syl(u) = *)f 2 (u j)Sy(uj) 

from the measured power spectral density 5y(w). 
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Chapter V Random Processes in Linear Systems 

If there is no linear relation between X(t) and Y(t), such that 
Rxy[ t ) = 0, then = 0 and *y 2 (u ;) = 0. Thus, the coherence 

function is useful in determining whether a linear relation exists between 
the input and output at any given frequency. It should be emphasized 
that the absence of a linear relation does not necessarily imply that 
there is no relation between X(£) and V'(f). 

The coherence function may be used in many ways. For example, 
suppose that a new composite material is developed and one wishes to 
determine whether it can be characterized as a linear system. The 
material could be excited at one point by a random signal and its 
response measured at another point. The coherence between the 
two signals would then indicate whether the material is linear in its 
behavior. Another use is to determine whether two time histories, which 
may have similar power spectral densities, are linearly dependent. For 
example, many people believe that sunspot activity produces effects 
observable here on Earth, such as climatic changes evidenced by crop 
yields. Data on the activity of sunspots going back to the year 1610 
exist 10 and their power spectral density exhibits a peak corresponding 
to a period of approximately 11 years. If one had data on some 
other phenomenon, such as wheat yields in Iowa, whose power spectral 
density also exhibited a peak showing an 11 year cycle, then one might 
suspect a linear relationship between the wheat yields and the sunspot 
activity. However, the similarity of the spectra might also be merely a 
coincidence. Coherence analysis of the two records over the same span 
of years would resolve the question. In one such test of which the author 
is familiar, coherence analysis showed no linear relation at the 11 year 
cycle, but did suggest a linear relation with a period near 80 years. 
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Chapter VI 

Estimation Theory 

This chapter begins examination of the problem of evaluating the 
various expressions developed in the previous chapters from data ob- 
tained in the real world. Most of these expressions involved ensemble 
averages over the entire ensemble of time histories that comprise the 
random process. This entire ensemble is never available for analysis 
in practical situations. Nor is the probability distribution function of 
the random process known. Thus, one must be content to estimate the 
quantities of interest, often on the basis of a single realization of the 
random process. 

6.1 Estimation of a Parameter by a Random Variable 

Consider an unknown parameter 6 which one wishes to estimate 
by a random variable 9. In this monograph, a hat over a quantity 
indicates an estimate. Why would one want to make such an estimate? 
Recall that the assumed data comprise a record of length T of a single 
sample function from a stationary random process X(£) as shown in 
figure 15. On the basis of this single sample function, it is desired 
to estimate parameters (i.e., moments) of the random process itself, 
such as mx, i?x( r )i and Sx(w)> which for fixed r and cj are constants. 
However, the estimates will be random variables, since they will depend 
on which particular sample function is used. 

The first requirement one would ordinarily desire in an estimate is 
that 

E{9} = 6 (6.1) 

where the expectation is over all possible values of the estimate (i.e., 
over all possible sample functions that comprise the random process) 
in order that the average value of the estimate would be the quantity 
to be estimated. An estimate which satisfies this requirement is said to 
be unbiased . When this is not true, it is possible to define 

Bias = E{6 } - 9 (6.2) 
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X(t) 


■ ■ywL/; 


t 


Figure 15. Finite length sample function. 

Bias is a systematic error due to the estimation technique employed. 

The variation of the estimate 9 about its mean value depends on 
which sample function is employed. In order to keep this uncertainty 
as small as possible/ one should choose the estimate so that it varies 
as little as possible about its mean value; that is minimum variance 
estimation is desired. This uncertainty is measured by the standard 
deviation of the estimate: 

Uncertainty = (£{[0 - £(0)] 2 })V2 , (6.3) 

Thus, one ordinarily attempts to choose an estimation technique that 
is without bias and whose uncertainty is as small as possible. 

6.2 Estimation of Mean 

Suppose that X(t) is a stationary random process and consider the 
estimate 



Since mx is a number assigned to each outcome of the experiment, it 
is a random variable. In addition, 

E{m x } = i jf E{X(t)}dt mOS&jJdtm m x 

and, thus, m\ is an unbiased estimate of the unknown mean m\ of 
the random process X(t). Further the variance of m\ is 
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£{mx-m x ) 2 } = £; WO-mx]* 

- £ { f! £ Will -”>xl *1 £ IWj) - 

= £ /„ *'L 


mx\ dt-i 


dti T x (tj - <i) 


m fJ- T rxt,) {'- ¥)* 


where the integral relation 
rT rT 


[ dty f dhF(t 2 -ti)m f F(r)(T-\r\)dr 
Jo Jo J-T 


(6.5) 


( 6 . 6 ) 


has been employed. In this equation, F(r) is any function of the variable 
r = t 2 — t\. This fundamental relation (eq. (6.6)) is obtained by 
employing the change of variables t\ = t and 1 2 = t + r and noting 
the region of integration as shown in figure 16. The integral over t may 
then be evaluated. Upon use of equation (6.5), the uncertainty in the 
estimate in equation (6.4) of the mean is seen to be 


Uncertainty = ^ J f x (r) (l - y ) dr 


T 1/2 
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Introduction to Time Series Analysts 


which can, in theory, be made as small as desired for a completely 
random process by obtaining more data (i.e., by letting T — oc) since 
lim r x (r) = 0. However, for T fixed, the uncertainty is fixed and 

r— oo 

nonzero. 

63 Estimation of Autocorrelation 

Consider the autocorrelation estimate defined by 

Rx(r) = =— — f X(t)X(t + r)dt (0 <t<T) (6.7) 

T -r J o 

Its expected value is 

[ T ~ T Rx(r) dt = R x (t) 

T - t Jq 

and thus, Rxi r ) is an unbiased estimate of Rx( r ) for 0 < r < T. 
Several points should be noted about this estimate: 

1. The estimate can be determined only out to a lag of r = T from a 
data record of length T. 

2. Equation (6.7) also provides an estimate for -T < r < 0 since the 
autocorrelation function is even. 

3. Less data contribute to the estimate as r increases, as can be seen 
from the limits of integration. Thus one would expect the variance 
of the estimate to increase with r. In other words, the estimate of 
R x (r) becomes more uncertain as r becomes larger. 

An equation for the variance of Rxi r ) involves the expected value 
of the product .of the values taken by the process at four different times 
and will not be included in this monograph. However, the variance 
can be shown to depend upon both T and r and to approach zero as 
T ^ oo. Thus, the uncertainty in this estimate can again be made as 
small as desired by obtaining more data. 

6.4 Estimation of Cross Correlation 

In a similar manner, the expressions 

Rxv(r) = f T " X(t)Y(t+r)dt (0 < r < T) (6.8) 

1 — T J 0 

and 

1 f T ~ r 

Ryx(t) = = / Y(t)X(t+r) dt (0 < r < T) (6.9) 

1 - T J 0 
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can be shown to be unbiased estimates of the cross correlations of two 
jointly weakly stationary random processes. Again, their variances 
depend on both T and r and approach zero as T — ► oo. Note that 
the cross correlations are not generally even functions of r. However, 
the values for negative lags can be obtained from the relations 

Rxy(~ t ) = R-y: y( t ) 

Ryx{~ t ) = ^xy( r ) 

6.5 A Test for Stationarity 

Because the assumption of stationarity is fundamental to most of the 
analysis in this monograph, it is important to derive a test to ascertain 
whether it is reasonable to assume that a particular time history is a 
sample function from a stationary random process. Fortunately, such 
a test is readily developed. 

Consider a sample function of length T from a random process X(£) 
and suppose that it is broken into Nq blocks of data of length Tg, as 
shown in figure 17. Then, an estimate of the mean 



and an estimate of the variance 



can be calculated for each block of data. A test for stationarity is 
then developed by recalling that the mean and variance of a stationary 
random process are constants. Thus, the estimates of the mean 
obtained from the different blocks of data should be equal, except for 
random variability. In addition, the estimates of the variance obtained 
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Introduction to Time Series Analysis 

from the different blocks of data should also be equal. If one wants to 
be sophisticated, there are various statistical tests which will allow one 
to accept or reject, with a certain level of confidence, the hypotheses 
that these estimates are all from an underlying random process with 
constant mean and variance. However, for practical purposes, if these 
estimates vary more than a few percent for “reasonably sized” blocks 
of data, one should be cautious in placing much faith in analyses based 
on stationarity. Here, “reasonable size” is a judgment, based on the 
frequency content of the data. 

It might be mentioned that the estimate of the variance (eq. (6.10)) 
is an example of a biased estimate. Its mean can be shown to be 

E |<7*} =<7 2 x -E {(mx-m.x) 2 } 

where the variance of the estimate of the mean is given by equa- 
tion (6.5). Thus, the mean value of the estimate of the variance is 
less than the actual variance. However, this bias does not affect the 
test for stationarity since all the variance estimates will be biased by 
the same amount if the process is stationary. 
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Chapter VII 


Estimation of Power Spectral Densities 

There are two techniques in common use today for estimation of 
power spectral densities: the Blackman- Tukey and finite Fourier trans- 
form techniques. There are also newer techniques, such as maximum 
entropy and maximum likelihood, which, though still subjects for re- 
search, appear to have certain advantages over the older techniques and 
may one day become standard. These newer techniques are discussed 
in chapter XII. 

7.1 The Blackman-Tukey Approach 

The first practical approach to the estimation of power spectral den- 
sities was developed by Blackman and Tukey. 1 Although this approach 
has been superseded as the standard by the finite Fourier transform 
approach, it is still valuable for certain applications, particularly when 
one is analyzing digital data and the number of data points is not a 
power of two. Further, it illustrates some of the difficulties in spectral 
estimation more clearly than does the finite Fourier transform. Thus, 
it will be discussed first. 

Recall that the power spectral density is defined as the Fourier 
transform of the autocorrelation function, that is, 

SxM = ^/I flx(r)e ‘ MTdr 

From equation (6.7), an estimate of the autocorrelation for |r| < T can 
be obtained as shown in figure 18. Thus, Blackman and Tukey define 
their estimate as the finite Fourier transform of this function, that is, 

Sx M = “ £ kx(r) t~ lWT dr (7.1) 
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v t > 



Figure 18. Estimate of autocorrelation. 


Uj(T) 

_Jl_ 


-T T 


T 


Figure 19- The boxcar function. 


In effect, the Blackman-Tukey approach assumes that Rx(t) = 0 for 
|r| > T. The first question, of course, is how good is this estimate? 
Introduce the boxcar function 

ur(r) = / 1 (M < T) 

' \ 0 (Otherwise) 

as shown in figure 19. With the use of this function, the estimate in 
equation (7.1) may be written as 

Sx M = ^ ut(t)R x {t) e~ lwr dr (7.2) 

with expectation 

E{SxM} = i- J^ UT{ r)R x (r) e -^dr (7.3) 

since R X (t) is an unbiased estimate of R X (t) for |r| < T. Now. 
equation (7.3) is the Fourier integral transform of the product of two 
functions. Thus it may be evaluated by the convolution theorem 
(eq. (2.8)). That is, since the Fourier transform of the autocorrelation 
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U j ( co ) 



Figure 20. The spectral window Ut{w), 
is the power spectral density, 

£{5xM} = f°° U T {uj-J)Sx(^)du' (7.4) 

where Ut{u), called a spectral window , is the Fourier transform of the 
boxcar function and is given by 

U t (uj) = [ u T (r)e- XUT dr = — sinc(u/T) (7.5) 

2 7T 7-00 * 


where 


smc x = 


smx 


(7.6) 


is the sine function, which has the property that lim(sincx) = 1. 

x — *0 

Although some authors include a ir in the definition of the sine function, 
equation (7.6) is preferable for time series analysis. The spectral 
window Ut(u) is shown in figure 20. Its main lobe has height T/n 
and width 2 7r/7\ and its integral is unity since by definition 



U T {u) e luJT du 


and thus 


u t(0) = I 


TOO 

/ u t M 


du 


Therefore, as T — * oo, the spectral window has the characteristics of a 
delta function, in which case the integral given in equation (7.4) would 
be Sx(u) and the estimate would be unbiased. 

However, for finite 7\ the spectral window {/^(u;) is not a delta 
function; it has a main lobe of finite width and both positive and 
negative side lobes. Thus Sx(u/) is not an unbiased estimate of the 
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Introduction to Time Series Analysis 


power spectral density Sx(^)« It can be seen from equation (7.4) 
to be biased because its expectation is a convolution of the actual 
power spectral density with the spectral window Ut( u/). Further, the 
finite width of this window causes a major problem with the frequency 
resolution of the power spectral density, as will be discussed later in 
this chapter. There are also other unfortunate aspects to this estimation 
technique, such as the negative estimates that can be produced at times 
by the side lobes of the window and the high variability of the estimate, 
as will be discussed in chapter VIII. 

It should be noted, however, that even though this estimate is biased, 
it is still power preserving in the sense that, from equation (7.3), 


E | 5 ,y(w)} du = j dTu T {r)R x {T) ^ J 

= f dTu T {r)R x {T) 6 {T) 

J -oo 

= u T (0)tfx(0) = £{* 2 (*)} 


du 


since ut’(O) = 1. Thus, the mean value of the integral of the spectral 
estimate is the same as the integral of the actual power spectral density, 
the total power of the process. 


7.2 Windows 

In an attempt to improve their estimation technique, Blackman and 
Tukey developed a new class of estimates 

5 X (w) = 2 ^ J ^ “( r )£x(r) dr (7.7) 


where u(r) is a real function called the lag window corresponding to the 
spectral window 


uM= ^IZ> u{T)e ~ tuiTdT (7 - 8) 

Since equations (7.7) and (7.2) are identical in form, these new esti- 
mates have similar mathematical properties to those discussed earlier. 
Blackman and Tukey ’s idea was to pick a lag window that led to a spec- 
tral window which has smaller side lobes or other desirable properties 
and which reduces the variability of the estimate. 
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Chapter VII Estimation of Power Spectral Densities 


U(T) 



Figure 21. Arbitrary lag window. 


All acceptable lag windows must satisfy three requirements: 

L u( 0) = 1, to preserve power 

2. u(r) = u(— r) so that Sx{u) is real 

3. u(r) = 0 for |r| > T m where T m < T, so that unavailable data are 
not required 

Thus, all lag windows have a shape similar to that shown in figure 21. 
Since any function that satisfies these three requirements is an even 
function for — T m < r < T m , it can be expanded in a Fourier cosine 
series in that region: 

«(r) = T + f>"C03^ (|r| < T m ) (7.9) 

1 n«l lm 

where 

“(0) = Y + H an = 1 

Thus, equation (7.9) represents the whole family of windows for different 
choices of the coefficients a n . The corresponding spectral window is 
given by 

= ~ /_ u(r) e~ luJT dr 
T 00 

= a n sinc(o;r m - titt) (7.10) 

n=— oo 

Thus, the arbitrary spectral window consists of a sum of sine functions 
occurring at the frequencies u; = mr/T m as shown in figure 22. 
Generally, the number of nonzero a n ’s determines the effective width 
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Figure 22. Arbitrary spectral window. 

of the main lobe of the window. However, the positive lobes of one sine 
function tend to cancel the negative lobes of another. 


Some classical windows. Some lag windows have been so widely 
used that they are worthy of discussion: 

1. Boxcar (oq = 2, a* = 0 for |n| > 0): 

u(r) = / 1 (M < T m ) 

\ 0 (Otherwise) 

This window, which was the original Blackman- Tukey window, is shown 
in figure 19 and its corresponding spectral window is displayed in 
figure 20. 

2. Hanning (oq = 1, ai = 1/2, a* = 0 for |n| > 1): 


U ( T ) - / ^( 1 + cos 7^) (M<T m ) 
1 0 (Otherwise) 


This window was named for the Austrian meteorologist Julius von 
Hann 1 and is probably the most widely used window for Blackman- 
Tukey spectral estimation. 

3. Hamming (ao = 1.08, a\ = 0.46, a n = 0 for |n| > 1): 


, 1= f 0.54 + 0.46 cos (|r| < T m ) 

\ 0 (Otherwise) 

This window was named for R. W. Hamming. 1 Although it minimizes 
the height of the side lobes for a two-term series, it has a discontinuity 
at | r) = T m which is not present in the Hanning window. However, 
note that the Hamming and Hanning windows differ negligibly. 
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4. Cosine bell: 



(M < T m ) 

(Otherwise) 


Although this window corresponds to a complicated set of a n ’s, its 
transform can be written 




Tm COS CjT m 

2 (t2/4-u;27^) 


Figure 23 presents a comparison of the cosine bell and Hanning lag 
windows and figure 24 presents a comparison of the corresponding 
spectral windows. 

5. Bartlett: 

u(r) = ( 1 “ fi d T l < 

[ 0 (Otherwise) 

This window was named for M. S. Bartlett. 11 Again it corresponds to 
a complicated set of a n ’s. However, its transform may be written 


rrr ^ . 2 ( {jJ Tm\ 

t/H = — smc ( — j 

since it is produced by the convolution of two boxcar windows. As 
can be seen, the Bartlett spectral window has no negative side lobes. 
However, its main lobe is very broad. 


Cross spectral densities. With the Blackman- Tukey approach, 
similar estimates of the cross spectral densities may be developed. For 
example, from equations (6.8) and (6.9), the cross correlation Rxy( t ) 
may be estimated for |r| < T. Then 

SxyM = 2 ~ J u ( t )Rxy ( r ) e -lu/r dr (7.11) 

yields an estimate of the cross power spectral density. 

73 The Finite Fourier Transform Approach 

The technique for power spectral density estimation in most common 
use today is based on the Finite Fourier Transform of the data rather 
than on the transform of the autocorrelation function. Historically, 
this technique was developed first and produced what was called the 
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Figure 23. Comparison of cosine bell and Hanning lag windows. 
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Chapter VII Estimation of Power Spectral Densities 

periodogram. 11 However, it waa only with the introduction of the Fast 
Fourier Transform (to be discussed in chapter X) that the technique 
became practical for large amounts of data. 

From a single sample function of length T of a random process 
X(t), it is clear that the Fourier integral transform of X{t) cannot be 
computed because the data are not known for infinite time. However, 
the finite Fourier transform 

X T (u) = ± £ X(t) e - Mt dt (7.12) 

can be calculated. Further 


£{XfM*rM} = -Jy y dt\ f dt 7 Rx(t7-t 

- 5^5: j \r\R x ( r) t-'“ r dr (7.13) 


where equation (6.6) has been employed. Now, if X(t) is a completely 
random process, the second integral in equation (7.13) is finite. Thus, 
one finds that 

r lim ) y£{|X T M| 2 } = 5 x (u;) (7.14) 

since the second integral is driven to zero by the T~ l multiplier and 
the first integral becomes the power spectral density. 

Based on this relation, the class of power spectral estimates 

S*( W ) = W s |X F (u,)| 2 (7-15) 

may be introduced where 

X F {uj) = ^ d(t)X(t) e -,wt dt (7.16) 

is a Fourier transform of the data as seen through the window function 
d(£). The data window is a real function with the property that d(t) = 0 
for t < 0 and t > T so that unavailable data are not required. The 
correction factor W$ is to be determined. 

65 



v v 

L L 


H 1M1 li 11 II. li L II U li II U K 



Introduction to Time Scries Analysis 

How good is such an estimate? Again, for u ; fixed, this estimate is 
a random variable with mean 

£ {^ (w) }*S JZ> dt 1 /_* * 2^(fiW2)^v(t2-ii)e‘ ,u/(tJ " ei) 
= hrj*t /_« " r) dt rx(t) e ~ x “ T dr (7i7) 

upon setting = £ — r and £2 = *• Note that equation (7.17) is 
analogous to equation (7.3) with the equivalent lag window 

u(r) = ^ J d(t)d(t-r)dt (7.18) 

which is a convolution of the data window with itself. Thus, the finite 
Fourier transform and Blackman- Tukey techniques are mathematically 
equivalent in their expectations as long as equation (7.18) satisfies the 
conditions for a lag window. The first condition, that u(0) = 1, for 
power preservation requires that 

« (°) = d 2 (t) dt = 1 

In other words, the window correction factor must be 



Thus, the estimate in equation (7.15) becomes 


(7.19) 

The second condition, that u(r) be an even function of r, is identically 
satisfied since 

«(- r ) = ^ J ^d{t)d(t+r)dt 

= d (t'-r)d(t')d' = u(T) 

upon setting t' = t + r. The third condition, that u(r) = 0 for 
|r| > T is also identically satisfied since u(r) is the convolution of 
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two data windows that are nonzero only in the range (0 ,T). Thus, 
equation (7.18) does satisfy the conditions for a lag window, and it 
can be seen that the finite Fourier transform and Blackman-Tukey 
spectral estimation techniques are equivalent in their expectations. 
That is, finite Fourier transform estimation with a data window d(t) 
corresponds to Blackman-Tukey estimation with the lag window given 
by equation (7,18). A converse statement is also valid. 

One comforting property of the finite Fourier transform technique 
can be seen by introducing the Fourier transform of the data window 

D{u) m 1- f°° d(t) e~' wt dt (7.20) 

27T J —oo 

where 

d{t) = f°° D{u) e^dui 

J -oc 

Then, the equivalent spectral window is given by the Fourier transform 
of equation (7.18). Note that the convolution theorem (eq. (2.9)) 
cannot be applied to equation (7.18) directly since the convolution in 
equation (7.18) depends on t — r rather than on r — t. However, the 
Fourier transform of /(— t) is F(-uj). Thus, by equation (2.9), the 
equivalent spectral window is 

U{u) = f u(r) e*** 7 dr — W$D(u)D{—<jj) 

2?t J — oo 

Further, since d(t) is real, Z?(— uj) — D*(u). Therefore 

U{u) = ^sPMI 2 (7.21) 

and it is seen that the equivalent spectral window is always non- 
negative. Thus, the finite Fourier transform technique cannot yield 
negative spectral estimates. 

Any of the lag windows introduced previously may be employed as 
data windows by setting 

d(t) m u(2*-T) (7.22) 

which merely amounts to shifting the range from (— 7\ T ) to (0, T). The 
cosine bell window and the boxcar window modified by cosine tapers 12 
on each end are probably the most widely used windows in finite Fourier 
transform estimation. 
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Cross spectral density estimation. Again, similar estimates for the 
cross spectral densities may be developed. For example, 

SatM = WsXrMYfiuj) (7.23) 

where Yp(<jj) is given by equation (7.16) with X(t) replaced by Y(t). 


Autocorrelation and cross correlation estimates . If the power spectral 
density is estimated by the finite Fourier transform, the autocorrelation 
could then be estimated by Fourier transformation, that is, 




«*(«)«* 


Again, the question is how good is this estimate? Its expected value is 

E { Rx(r ) } = |_~ £ { S*M } e twT du 

where, by equation (7.17), 

E (Sx(u,)} = ± u(t)R x (t) e~' UT dr 

and ti(r) is given by equation (7.18). Thus, 

£{**(')} = u(r)£x(r) 

and the estimate in equation (7.24) is seen to be biased. Even if the 
boxcar data window is employed, 

U ( T ) = f d(t)d(t-r) dt = \~Y (|t| < T) 

which is a Bartlett lag window, and the estimate is still biased: 

£{£*(r)} = (l-^)**(r) 

and in order to make the estimate unbiased, it is necessary to define a 
new estimate 
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where Wr is the window correction factor 



Similar estimates may be developed for the cross correlations, for 
example, 


Rxy( t ) = W R [ $Xy(u) e luJT <Lj 
j — oo 


(7.26) 


7.4 Frequency Resolution 

The major problem introduced by a finite data record length is 
that of frequency resolution. Since the finite Fourier transform and 
Blackman-Tukey spectral estimation techniques have an equivalent 
bias, this problem occurs in both techniques. Thus, the problem 
of frequency resolution will be analyzed for the simplest case of a 
Blackman-Tukey estimate with a boxcar lag window. 

If a random process X(t) consists only of a single sinusoid with 
amplitude A and random phase 0, that is, 

X(t) = i4cos(cJi* 4* 0) 
then, its autocorrelation is 


A 2 

Rx( t ) = y cosw i t 


with power spectral density 

A 2 

Sx( u ) = ~ [$(u/— a/i) 4 <5 (u;-Kji)] (7.27) 

However, if equation (7.27) is substituted in equation (7.4), the expected 
value of the spectral estimate is 


E |s^(u;)| — -j- [Ut{uj —uji ) 4 Ut( w+cji)] (7.28) 

as shown in figure 25. Thus, the spectral estimate consists of two 
reproductions of the spectral window. Further, since the window has 
negative side lobes, the estimate is negative at certain frequencies even 
though the power spectral density, by definition, must be non-negative! 
It is only in the limit as T — oo that non-negative estimates consisting 
of two “spikes” at u = are obtained. 
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E^co)) 



Now, suppose that the signal consists of two equal-amplitude sinu- 
soids at frequencies uj\ and u/2> that is, 


X{t) = Acos(wit -I- <t> i) 4- i4cos(cJ2^ 4 * 02) 


with random phase angles. Then, the expected value of the spectral 
estimate is 




— — wi) + u )\ ) + C^7"(u/+W2)j (T.29) 


which is a sum of four window functions, two centered at <jj\ and u /2 and 
the other two centered at — uj\ and Depending on the frequency 
separation Aa / = (uj 2 - u/i) and the characteristics of the spectral 
window, it may or may not be possible to determine from the estimate 
that two separate frequencies are present. In the analysis that follows, 
it is sufficient to consider only the window reproductions at the positive 
frequencies u/i and u/ 2 . Three cases are possible as shown in figure 26. In 
figure 26(a), the two frequencies are so close together that the sum of the 
two window functions that represent them actually peaks at the mean 
frequency Q = (u/i 4- u^)/2, and no peak is visible at the frequencies 
u/i and CJ2- In figure 26(b), the frequencies are better separated and 
peaks at uj\ and UJ 2 are visible. However, the two window functions 
merge into one another. Finally, in figure 26(c), the frequencies are 
sufficiently separated that the two window functions are distinct. 

A criterion can be developed that will determine which of these three 
conditions is present for the spectral window given by equation (7.5) 
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E (S x (co)! 



E (S^co)} 



E (S^co)} 



(c) Fully resolved components. 

Figure 26. Spectral estimate of signal containing two sinusoidal components. 


and various separations of the two frequencies. Define the sum of the 
two window functions as 


f(ijj) as Wi) + — u/?) 


(w - w) sin(w - w)Tcoa ( ) T - ( ^ ) ct»(u - (S)T sin ( ) T 


(w - w) 2 - 


(7.30) 
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Figure 27. Peak amplitude functions. 

where Aw = (w 2 — wi). The value of this function at the mean frequency 
w = (<jj\ + W 2)/2 is 


= ~ sinc (^) r ( 7 - 31 ) 

while its value at the frequencies w* and W 2 can be obtained from a 
limiting process as 



.1 

1 4* sine 1 

( T cos I 


V 2 / 

It 

\ 

V 2 J \ 

>. 2 / J 


In terms of these two values, the criterion for resolution may be given 
as 


Unresolved: f(Q) > f (q ± 

Partially resolved: /(w±4j*)>/(w)> 0 

Fully resolved: /(w) =0 

Figure 27 is a plot of equations (7.31) and (7.32) as a function of the 
separation of the two frequencies Aw. From this figure, it can be seen 
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that partial resolution requires 

IA . 4.2 

|Acj| > — 

while full resolution requires the two equal-amplitude sinusoids to be 
separated in frequency by 


| Au/| = \uJ2 — cJi| 



(7.33) 


where the first zero of f(Q) has been utilized. 

This analysis of the resolution problem has assumed that the two 
sinusoids were of equal amplitude. Clearly, the problem of resolving 
two sinusoids becomes even more difficult if the sinusoids have unequal 
amplitudes. 

The resolution problem also places a lower limit on the frequencies 
that can be observed in a record of length T. Any frequency u lower 
than 2 tt/T\ or 




(7.34) 


cannot be differentiated from zero frequency. Thus, frequencies / below 
1/T appear as a nonzero mean or linear trend in the data. 


Bias . If the signal X(t) is a completely random process, the 

preceding discussion of frequency resolution may, on occasion, make 
matters look worse than they really are. Recalling the definition of the 
boxcar function and the fact that the autocorrelation is even, it can be 
seen from equation (7.3) that 


E = ~ J q u t{ t )Rx( t ) cos (jjt dr 


1 

= Sx{u) / Rx{t) cos ur dr 

7T JT 


and thus 


Bias = E |5x(<^) j — «?x(<^) = — ^ J Rxi r ) cosojt dr 


(7.35) 


Therefore, the bias depends on the values of the autocorrelation at 
lags greater than the record length. If X(t) contains periodic signals 
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Introduction to Time Series Analysis 

or very low frequency components, these autocorrelation values are 
nonzero and the preceding discussion is relevant. However, if -Y(t) 
is a completely random process, the autocorrelation may well become 
essentially zero before the end of the record length, in which case the 
estimate is unbiased. It is surprising that this result has received very 
little attention in the literature, as the conditions for its validity are 
often fulfilled in practice. 

When the estimate is biased, a better understanding of the bias can 
be obtained by noting that equation (7.4) may also be written 

E {$*(«)} = j°° U T (\)S x {u - X) dX (7.36) 

Because Ux(\) is highly peaked about A = 0, most of the value of 
the integral comes from this region. The power spectral density may 
be expanded in a Taylor series about A = 0, with primes indicating 
derivatives with respect to u 


Sx(u— A) — Sx(u) ~ A S f x (u) + -^Sxiuf) — • • • 

and then equation (7.36) can be approximated by integration over only 
the main lobe of the window. The result is 


£{SxM}* J*^ T U T (\)S x (uj-\)d\ 

r n , r*/T 

as S x (uj) / U T { A) dX - S'v-M / XU T {X) dX 

J-ir/T J-ir/T 


* S X M + 


since the first integral is nearly unity (i.e., ur(0) = 1) and the second 
integral vanishes because the window is an even function. Thus, smother 
expression for the bias may be obtained: 

Bias = E {$xM } - S*M * (7.37) 
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Figure 28. Smoothing produced by estimation. 


It can be seen that 

£{s x (o,)}> 5 y M (If S'x( u ) > 0) 

< Sx( U/) (If S%(u) < 0) 

That is, the estimate will be too low at maxima of Sx(cj) and too high 
at minima and will therefore provide a smoothing of the spectral density 
as shown in figure 28. This result is to be expected, since the spectral 
window acts as a moving-average low-pass filter (to be discussed in 
chapter XI) and suggests a method for attempting to reduce the bias 
as discussed below. 

Prewhitening and postdarkening. One technique that is sometimes 
used to try to remove the bias caused by estimation is called pre whiten- 
ing and postdarkening. Because the bias is proportional to the second 
derivative of the actual power spectral density, the idea is that if the 
actual power spectral density were flat or even linear in u/, the bias 
would be zero. 

Suppose that the signal X(t) has a power spectral density with peaks 
as shown in figure 29. If one could design a filter that has valleys at 
the frequencies at which the spectrum has peaks, as shown in figure 30, 
and then pass the signal through the filter, the power spectral density 
of the output signal Y(t), given by 

Sy(w) = 

would be nearly flat as shown in figure 31. Thus, an estimate of the 
power spectral density of the random process Y (£) should have little 
bias. Such a technique is called prewhitening. An estimate of the 
power spectral density of the original signal X(£) is then obtained by 
reversing the process, that is, 




SyM 

l#MI 2 
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|Hto)l 

' n \ 

1 1 — - co 

«| «2 

Figure 30. Prewhitening filter. 


S Y (co) 



Figure 31. Prewhitened power spectral density. 

This technique is called postdarkening. 

To actually implement the technique, one would first obtain a raw 
estimate S,v(u/)» use this to design the filter, and then carry out the 
procedure. It could even be done iteratively if necessary. The process 
for designing such a filter will be discussed in chapter XI. 
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Chapter VIII 


Uncertainty in Power Spectral Estimates 

The spectral estimates discussed in the previous chapter also vary 
about their mean values, depending on which particular sample function 
of the random process is employed. This uncertainty can be very large 
(as will be seen), and thus means for reducing it must be devised. The 
two common techniques for spectral estimation use completely different 
approaches to solve this problem. 

8.1 Understanding of Uncertainty 

An understanding of uncertainty in spectral estimates can be gained 
from one case which can be worked out completely, because of the 
assumed normality of the random process. Suppose that X(t) is a 
stationary, normal random process with mean zero and variance al. 
Then by analogy to equation (7.14), consider the random variable (ter 
u / fixed) 

Z{uj) = lim ^|X T (w)| 2 (8.1) 

i — *00 1 

whose expected value is the power spectral density Sx(u/). Now, from 
equation (7.12), Z(u) may be written 

Z(u) = X?(u) + X$(u>) (8.2) 

where 

i r T 

Xi(w) = lim / X(t) cosust dt 

T — oo v'2jtJ’ Jo 

i r T 

X 2 (ui) = lim / X(t)ainuitdt 

Since the operations on the normal random process X(t) represented 
by these expressions are linear, Xi(cj) and ^ 2 ( 0 ;) are normal random 
processes, or for u fixed, normal random variables. (See chapter III.) 
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Further, it can be shown by taking expectations of equations (8.3) 
that 

E{X i(cj)} = E{X 2 (u)} = 0 (8.4a) 

and 

E (x?(w)} = E {x 2 2 (u>)} = (8.4b) 

Thus, Xi(w) and X 2 (w) are identically distributed. In addition, 

E {Xi{u)X 2 {u)} =0 

and so they are uncorrelated; that is, for all u the correlation coefficient 
of these two random variables is 

E{X x {u)X 2 ( u/)} n 

- 2 -o 

where a* = Sx{w)/ 2. Therefore their joint density function (see 
chapter III) factors, that is, 

fx iX 3 ( x 1i x 2) = fxi(^l)fx 2 {^2) 

Thus, Xi(u;) and X 2 (w) are independent random variables and Z(uj), 
as given by equation (8.2), is the sum of squares of two independent , 
identically distributed normal random variables. 

8*2 Application of the Chi-Square Random Variable to Spectral 
Estimation 

The random variable 


Z(u) = lim ^|X t M| 2 
r— oo i 

is, except for the limiting process, the power spectral density estimate 
given by equation (7.19) with a boxcar data window. Further, it has 
been shown that Z(u/) is a sum of squares of two independent, normal 
random variables X\ and X 2 with means zero and variances Sx{^)/2. 
Thus, the random variable 
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Chapter VIII Uncertainty in Power Spectral Estimates 

is a chi-square random variable with two degrees of freedom, as dis- 
cussed in chapter III, and its mean is 

E{Zq{uj)} = 2 

Figure 32 is a plot of the variation of a chi-square random variable 
about its mean as a function of degrees of freedom k. Eighty percent 
of the values taken by the random variable will lie between the bounds 
shown. Thus, these bounds represent the 80-percent confidence limits 
on the random variable. Although other limits could have been plotted, 
these limits have become standard for use in spectral estimation. 

Since Zo(w) is a chi-square random variable with two degrees of 
freedom, it can be seen from figure 32 that 

P (o.i < < 2.3) = P {0.1 < < 2.3} 

l 2 J \ S X M / 

= P {O.lSxM < Z{ w) < 2.3 S x (w)} - 0.8 (8.5) 

which says that if Z(ut) is viewed as a spectral estimate, 80 percent of 
the time it will lie between 10 percent and 230 percent of the actual 
power spectral density. Clearly, this uncertainty is unacceptably large. 


8 J Block Average 


The finite Fourier transform spectral estimate given by equa- 
tion (7.19) is essentially equal to Z(u) and thus can be expected to 
have similar wide variability. Therefore, all spectral estimates by equa- 
tion (7.19) will be assumed to be essentially chi-square random variables 
with two degrees of freedom. This assumption will be employed for all 
random data , even when the underlying random process is not known 
to be Gaussian. Since the Gaussian model fits many real world phe- 
nomena, such an assumption may not be unreasonable, especially if 
one requires only a relative evaluation of uncertainty of one spectral 
estimate with respect to another. 

In order to obtain less variable estimates, suppose that the data 
are broken into Nq blocks of length Tq such that NqTq = T, 
as shown in figure 17. Then a technique similar to the test for 
stationarity in chapter VI can be employed. A spectral estimate S\{j) 
for j = 1,2, ...,Nq can be made over each block of data. Then, 
assuming independence of the blocks, each estimate is a chi-square 
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Figure 32. Variation of chi-square random variable. 
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Chapter VIII Uncertainty in Power Spectral Estimates 


random variable with two degrees of freedom and their average 
- iV s 

is essentially a chi-square random variable with degrees of freedom k 
given by 

k = 2 N b 

As can be seen in figure 32, this greatly reduces the uncertainty of the 
estimate. In fact, if a(k) is the left bound and 6(fc) the right bound in 
figure 32, it can be seen by analogy with equation (8.5) that 


P ja(Jfe) < < b(k) | = 0.8 

or, more meaningfully, 

p {W >Sx(w)> W} =0 - 8 (8 - 6) 

For instance, if the data are broken into 50 blocks, then a(100) — 0.82 
and 6(100) = 1.18. Thus, 

p{l.225xM > 5*(w) > 0.855x(w)} = 0.8 

or 80 percent of the time the actual spectral density will lie between 85 
and 122 percent of the spectral estimate. 

Of course, this reduction in variability has not been achieved without 
cost. Recall that from equation (7/33), the bandwidth of the spectral 
estimate is given by 



or A/ = ^ 

j T 


If the data are broken into blocks, the effective data length is no longer 
T, but Tq. Thus, the effective bandwidth of the estimate A / has 
increased to 


A/ = 


1 

T B 
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R.(,> 



Figure 33. Utilized portion of autocorrelation estimate. 


That is, in reducing the variability, the resolution has also been reduced. 
Writing k = 2 N& = 2T/Tq yields the fundamental relation 

k = 2A fT (8.7) 

That is, the degrees of freedom are equal to twice the bandwidth in 
hertz times the data length. Thus, for T fixed, a tradeoff dilemma. 


Reduced variability «=> Reduced resolution 


is apparent. The only way out of this dilemma is to obtain more data 
(if possible), that is, to increase T. 

8.4 Uncertainty Analysis for the Blackman-Tukey Technique 

Recall* that the Blackman-Tukey spectral estimate is 

SxM “ ^ f u ( T )^x( T ) e~ XUT dr 

where the lag window u(r) = 0 for |r| > T m . By a lengthy analysis, 
again assuming normality of the data, Blackman and Tukey 1 were able 
to show that this estimate can also be considered a chi-square random 
variable with degrees of freedom 



where T m is the half-width of the lag window. Thus, if the maximum 
width T m = T is employed, again the estimate has two degrees of 
freedom. The degrees of freedom are increased by taking T m < T. 

Since the autocorrelation can be estimated out to |r| = T. taking 
T m < T leads to the disquieting result that variability is decreased by 
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Chapter VIII Uncertainty in Power Spectral Estimates 

not using data at large lag values r, as shown in figure 33. This result is 
not so surprising when one recalls that less data went into the estimate 
of the autocorrelation at large lag values (see eq. (6.7)), thus making 
the variability of the autocorrelation estimate itself much higher at large 
lags. 

A similar tradeoff between resolution and variability is seen here as 
well, since the bandwidth of this estimate will now be A/ = 1/T m . 
Thus, again 


This relationship yields the number of degrees of freedom regardless of 
which estimation technique is employed . 
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Chapter IX 

Digital Time Series Analysis 


Most time series analysis is now done with the use of digital data, 
that is, samples of the random process taken at discrete instants of time. 
Digital computers are, of course, constrained to such data. However, 
even stand-alone spectral analyzers usually work with digital data now. 

The use of sampled data further complicates the estimation of the 
various statistical quantities of interest, as will be seen. Usually, the 
data are assumed to be sampled at equal intervals of time At, or at a 
sampling rate of 1/At samples per second. However, in recent years, 
techniques for analyzing data taken at random time intervals have been 
developed, as will be discussed in chapter XII. 


9.1 Shannon’s Sampling Theorem 

Much of the understanding of the analysis of equally spaced sampled 
data is* based on a marvelous result usually credited to Shannon 13 
although its origin is actually much older 14 . An interesting historical 
aspect of Shannon’s work is that, although it was submitted to the 
journal in which it appeared in 1940, it was not published until 1949, 
apparently having been caught up in the secrecy surrounding the war 
effort. 

Suppose that X(t) is a stationary random process and that sampled 
data X(nAt) exist for — oo < n < oo. On the basis of these data, it is 
desired to estimate the values taken by the random process at all times, 
that is, 

oo 

X(t)= £ a n (t)X(nAt) (9.1) 

n=-oo 

In other words, one wishes to interpolate between the data points to 
reconstruct the entire time history. An equivalent interpretation of 
equation (9.1) is that it is an attempt to expand X(£) in terms of a set 
of special basis functions a n (t ). The data X(nAf) are the coefficients 
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of the basis functions. The basis functions a n (t) are chosen such that 
the mean square error between X(£) and its estimate X(t), 

?=£{[*(0-X(i)] 2 } (9.2) 

is minimized. Since 


X(t)~ *n(t)X{nto) 


n»— oo 

for t fixed, the minimum occurs when 


00 


1 2 


a? 

da e (t) 




X(t)- J2 On^XinAt) 

n*— oo 


X{tAt) 


)" 


or 


Rx(t-tAt)~ 22 OnWx[(n-t)At}=0 


for — oo < l < oo. 
Now, recall that 


(9.3) 

(9.4) 


Rx(r)= r SxMe^du 

J— OO 

and thus, equation (9.4) becomes 


/■oo . 

oo 

/ S x ( U )e-^ 

e* w£ - 2 On(t)c" nAt 

y —oo 

n»— oo 

Now, comparing 

E a„(Oe ,wnAt 


n=— oo 


(9.5) 


with equation (2.4) for t fixed, it can be seen that this series is the 
Fourier series for a function f{ui) that is periodic with period 2 tt/AI. 
Thus, for — 7 t/A t < u j < 7r/A £, the function e luft may be represented 
exactly by the series 


e' ut = jr 
n*-oo 
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where 


an(<) = 


rir/St 

— / e »(t fa -s s\nc(<jj c t - nir) 

2?r J-Tr/bt 


and u/ c = it /A t. 

Now, suppose $x(w) — 0 for Ml > u>c* Then equation (9.5) is 
satisfied because Sx(<*>) causes the integral to be zero for |ur| > u/ c and 
the term in brackets causes the integral to be zero for M| < ur c . Thus, 
the best estimate is 


00 

X(£) = ^ X(nAt) sinc(u; c t — nir) 

n=-oo 

The mean square error in this estimate is 


e 2 = £ j(A’-x) 2 } = E{X 2 -2XX + X 2 } = E { (x - X) x} 
since by equation (9.3), |x 2 | = E j. Thus, 


H 


X(t)- £ a n (t)X(nAt) 


n=— oo 


X{t) 


= tfx(O) - J2 a n (t)R x (nAt - t) = 0 


n=— oo 


as can be seen from equation (9.4) by letting t = £At. Therefore 
X(t) = X(t)\ 

This result shows that, if Sx(u) = 0 for |u/| > u/ c , then a realization 
of the random process X(t) is perfectly reproduced from the sampled 
data X(nAt) by 


oo 

X(t) = ^2 X(nAt) sinc(u/ c J - n7r) 

n=— oo 


(9.6) 


This fundamental result is used in secure communications, long distance 
telephone transmission, digital music systems, and a host of other 
applications. 
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Although the derivation given here was for a stationary random 
process, the result also holds for deterministic and transient functions 
as long as their frequency content is limited to |w| < w c , as can easily 
be checked in two simple cases: 

Case I 

Let t — (At. Then 

sinc(oj c lAt — nir) = sinc[(£ — n)i r] = 6( n 

where fy >n is the Kronecker delta function, which is unity when i — n 
and zero otherwise. Thus, 


OO 

X{(A t)= X(nAt)6 (tn = X{(At) 

n=— oo 


and the data are reproduced, regardless of the properties of X(t). 
Case II 

Consider the deterministic, transient function with amplitude A. 
X(t) = Asinc(u/ C t) 


Then 

X(nAt) s= i4sinc(w c nAt) = Asinc(nir) = A<5 n ,o 
and equation (9.6) 

00 

X(t) s= ^ A6 n? osinc(u/ c t - nx) = Asinc(u;ct) 
oo 

reproduces the time history. 


9.2 The Nyquist Frequency and Aliasing 

The cutoff frequency 



1 

2 At 


is called the Nyquist frequency and is the highest frequency that can be 
reproduced from data sampled at equal intervals At. To see this, suppose 
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Figure 34. Illustration of aliasing. 


that X(t) is a sinusoid of frequency uq > c j c , that is, 

X(t) = Acosuqt 

as shown in figure 34 where the dots represent samples. From the 
sampled data, the sinusoid of frequency uq > u c is seen to be indistin- 
guishable from a sinusoid of lower frequency Further, f c = 1/2 At 
corresponds to a sinusoid with period l// c = 2 At. A sinusoid of this 
frequency would be sampled only twice per period while cosu/ 0 £ is seen 
to be sampled more than twice per period. Thus, u> a < and the 
sinusoid of frequency greater than u/ c cannot be distinguished from a 
sinusoid of frequency less than u c . Mathematically, consider the most 
often encountered case where uq is only slightly larger than u c \ that is, 
uq = u/ c + ui where oji < u c . Then 

X(nA£) = AcosuqnAf = Acos(u; c 
= A cos (mr 4* u;/nAt) 

= A(cosn7r cos u in At — sin n7r sin u\n At) 

= Acos(nir — u;/nA£) 

= Acos(cj c — ui)nAt = Acosu a nAt 

where jj a = uj c — u Thus, the frequency uq = ui c + u/j is indistin- 
guishable in the sampled data from the frequency u a = u/ c — u/j. This 
phenomenon is called aliasing , because the frequency uq goes by the 
new name, or alias, u a . Aliasing is the major problem introduced by 
the use of (equally spaced) sampled data. 

The presence of this phenomenon suggests that if one wants to 
analyze data having a maximum frequency of / max , one must use a 
sampling rate 
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of at least twice the highest frequency in the data. Such data are 
said to be sufficiently sampled. To use a much higher sampling rate 
is wasteful, since Shannon’s sampling theorem implies that there is no 
more information to be gained from the data. However, in the real 
world, where few signals are ever truly band limited, some increase is 
advisable. A 2.5 /max sampling rate has been found to give excellent 
results 12 in most applications. 

It should be mentioned that oversampling is sometimes recom- 
mended 15 to give more freedom in the reconstruction of sampled 
data. If the data are sampled at a rate greater than twice the 
highest frequency, then Sx(w) approaches zero at some frequency 
lower than u j c . Thus, the term in brackets in equation (9.5) need 
not be zero between the highest frequency in the data and jJ c - This 
freedom allows one to choose other functions a n (t) in the expansion 
(eq. (9.1)) which converge faster than the sine functions. These 
faster converging expansions for X(t) are useful in many applications, 
such as long distance telephone conversation and photographic image 
reconstruction. 


93 Effect of Aliasing on Power Spectral Density 

In many applications, one may not know the maximum frequency 
/max* What happens if one just chooses a sampling rate and estimates 
the power spectral density? 

Let X(t) be a stationary random process with power spectral density 

SxM = ^ /_* R X (t) «" iwr dr (9.8) 

If the signal is sampled at intervals At, then the autocorrelation Rxi 7 ) 
may be estimated (as will be seen later in this chapter) only for r = j At 
for — oo < j < oo. Thus, the power spectral density estimate, which 
may include aliasing, is given by the discrete expression (to be shown 
later in this chapter) of the integral in equation (9.8) 


At 


= ^ Z kxUto)*-” 3 '** 


J — « 
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c(T) 



for \ui\ < jJ c . The expected value of this estimate is 

= ^ E Rx( me-”’* 

j=-oo 

= m c{t)R x (t) e~ tuT dr (9.9) 

since RxU&t ) » which is estimated from discrete data, will be shown to 
be an unbiased estimate of Rx(jAt). Here 

oo 

c(r) = 6(r-j&t) (9.10) 

i=-oo 

is the Dirac comb function shown in figure 35. 

The Dirac comb function is periodic with period A £. Thus, it may 
be expanded in the Fourier series 



for all r. The Fourier transform of this series is 

CM = ^ j_ c ( r ) e~ tuJT dT = E 6 (“- 2 ku c) 


OO 


(9.11) 


another comb function, this time in frequency as shown in figure 36. 
Thus, the Dirac comb function is one of those unusual functions whose 
Fourier transform has the same form as the function itself. 
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C(co) 



Figure 36. Dirac comb function in frequency. 



Figure 37. Aliased frequencies in spectral estimation. 

Since equation (9.9) is just the Fourier transform of the product of 
two functions, by the convolution theorem (eq. (2.8)), it may be written 

£ = At f C(<^)5jy(w-~u/) (kJ = V Sx{<*}—2ku c ) 

J ~°° *— oo 

(9.12) 

which is a sum of the values of the actual power spectral density. In 
fact, for a power spectral density as shown in figure 37, power at all 
the frequencies denoted by arrows appears as power at the frequency 
(jjq in the aliased spectral density. It should be emphasized that 
figure 37 represents an atypical case, which would result from very 
badly undersampled data. 

Another way of looking at this phenomenon is to note that since the 
power spectral density is an even function of u, 

Sx(uQ — 2kbJ c ) = Sx{2Icu c -ujq) 

and power is seen to be “folded” about the Nyquist frequency, much 
as one would fold a fan, with power at odd multiples of uj c appearing 
at the frequency u c and power at even multiples of 'jJ c appearing at 
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the frequency of zero. This situation is illustrated in figure 38. This 
figure displays two estimates of the power spectral density of sunspot 
activity obtained from the data mentioned earlier over the years 1610- 
I960 10 . For the first estimate, the sampling rate was one sample per 
year yielding a At of 1 year and Nyquist frequency of V 2 cycle/year. 
This estimate has 20 degrees of freedom. Note the appearance of a peak 
near / = Vn cycle/year corresponding to cyclic behavior in the data 
with an 11-year period. Also shown in figure 38 is a spectral estimate 
of the same phenomenon with a At of 7 years. Thus, the Nyquist 
frequency is Vu cycle/year and the sunspot activity is undersampled. In 
this spectral estimate, the power in the signal at frequencies higher than 
the Nyquist frequency has been folded about the Nyquist frequency and 
now appears as power at lower frequencies. In particular, the peak at 
/ = V 11 cycle/year now appears as a peak near / = V 21 cycle/year 
corresponding to a period of 21 years. If the cyclic behavior was a 
pure harmonic with frequency V11 cycle/year, which was sampled with 
a Nyquist frequency of Vm cycle/year, the power at V11 cycle/year 
would alias back to the frequency V14 — (V11 — V14) = — V11 = 

V 77 cycle/year, which is close to the frequency shown. It might be 
mentioned that the apparent increase in the amplitude of the peak is 
due to the power preservation feature of the spectral estimates discussed 
in chapter VII. 



Figure 38. Power spectral density estimates of sunspot activity. 

When working with data of unknown frequency content, the only 
way to be sure of avoiding aliasing is to pass the signal through a 
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Figure 39. The windowed data function. 

low-pass antialiasing filter \ which filters out all power at frequencies 
higher than the Nyquist frequency based on the chosen sample interval 
At. Sometimes it is advantageous to do this even when the frequency 
content is known. For example, if one is interested in only the low- 
frequency content of a signal that also contains high frequencies, it 
is possible to reduce the sampling rate and thus the computations 
required by filtering out the uninteresting higher frequencies. Such 
an antialiasing filter must , of course, be an analog filter applied to the 
continuous data before it is digitized, since after digitization there is no 
way to distinguish between actual power and aliased power. This point 
cannot be overemphasized as insufficient sampling will cause the data 
to be not only worthless but also misleading, as illustrated in figure 38. 

9.4 Gibbs’ Phenomenon 

There is smother more subtle way in which aliasing can enter into 
digital spectral analysis. For such analysis, it is always assumed that 
the random process X(t) is band limited. However, to use a finite 
length of data, it is necessary to suppose that the random process X{t) 
is multiplied by a data window d(t) and to introduce the windowed 
Fourier transform (eq. (7.16)): 

Xf{u) = h /_~ d(t)X(t) e -^dt 

This finite transform is exactly the Fourier integral transform of the 
windowed data d(t)X(t), which is identically zero outside the region 
(0, Tq), as shown in figure 39. 

Now, it can be proved 4 that a function which is nonzero for only 
a finite interval of time cannot be band limited in frequency. Thus, 
the windowed data cannot be band limited and aliasing must result. I 

Unfortunately, nothing can be done about this aliasing as long as one is 
confined to a finite length of data. However, if Tq is sufficiently large. 
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the power at high frequencies necessary to reproduce the function is 
ordinarily so low that it does not destroy the analysis in the frequency 
range of interest. 

Another possible source of aliasing is seen by recalling that the 
Fourier integral representation (eq. (2.2)), 

f{t) = f°° F(u) e iut du 
J -00 

is exact only where f(t) is continuous. At a point of discontinuity to, 
it converges to the average value 

/(£) + /(fr) 

2 

of the right- and left-hand limits of f(t) as t approaches £q, provided 
that these limits exist. Thus, if there are discontinuities at the begin- 
ning and end of the record length, as shown in figure 39, the finite 
Fourier transform (eq. (7.16)) converges to the average value at these 
discontinuities. Further, because of Gibbs’ phenomenon, 4 if one tries 
to represent a discontinuous function by a Fourier integral over a finite 
range of frequencies, the representation produces high-frequency oscil- 
lations near the points of discontinuity. Only by allowing unbounded 
frequencies are these oscillations removed. Thus, the presence of these 
discontinuities causes more power at the higher frequencies and, thus, 
more aliasing. 

This end-point discontinuity source of aliasing may be reduced by 
using a data window that is zero at the ends of the record, that is, 

d(0) = d(T s )=0 


This removes the discontinuities in the record and reduces this 
cause of aliasing. The Hanning, cosine bell, and Bartlett windows, for 
example, fulfill this requirement. Some aliasing may still be produced, 
however, caused by discontinuous derivatives at the ends of the record. 

A similar analysis holds for the Blackman-Tukey method of estima- 
tion as well. Thus, one ordinarily requires the lag windows to satisfy 


u(— T th) — u(T m ) — 0 


£ 
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9.5 Relationship Between Continuous and Discrete Fourier 

Transforms 

Shannon’s sampling theorem allows determination of the relation- 
ship between Fourier transforms of continuous and discrete data for 
band-limited random processes. To eliminate the effects of aliasing in 
the following discussion, it will be assumed that all digital time histo- 
ries have been passed through an antialiasing filter and are thus band 
limited . 

Let X(t) be a random process that is limited to the band |u/| < jj c * 
where u/ c = ir/&t and A* is the sample interval. Then, its Fourier 
integral transform (eq. (2.3)) is given by 

x{uj)= ^I2o x{t)e ~ Mtdt 

= f dt e~ Ujjt ^ X(nAt)sinc(uj c t - mr) 

2n J -oo „=-oo 

= V X{nAt) f e~ Mt smc{<jj e t — mr) dt (9.13) 

n=-oo J ~°° 

where Shannon’s sampling theorem (eq. (9.6)) has been applied. 

Now, the Fourier transform of the sine function is 


-f 

2ir/-c 


1 sinc(u/ c £ — mr) dt = 


g-mtrtj/cjc roo 


“{o‘ 


— mirw/wc 


sine ye 

oo 

(M < *c) 

(Otherwise) 


— iujy/uc 


upon setting y = u c t - mr, since 


/ oo 

sir 

-oo 


„ / (1*1 < 1) 

\ 0 (Otherwise) 


Thus, equation (9.13) becomes 




& E X(nAt)e~^ nAt (|w|<«e) 


(Otherwise) (9.15) 


till I r I: L 1 


1' 
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4 Interval Represented ► 


Figure 40. Interval represented by digital data. 

since u t c = ir/At. This exact expression would allow calculation of the 
Fourier integral transform from discrete data X{nAt) for — oo < n < oo, 
and is an extension of Shannon’s sampling theorem; that is, the Fourier 
transform of the signal can also be reconstructed from the sampled 
data. 

In actual practice, the amount of data is always finite. Usually, 
knowledge of the values A'(nAt) for n = 0, 1, 2, . . . , N - 1 is assumed. 
These values are taken to represent the sample function A'(t) in the 
interval (— At/2, T—At/2) as shown in figure 40 where the dots indicate 
data points. That is, the data point X(£At) is thought to represent the 
time history in the interval ((£ - l /2)At,(i + Va)At). 

Now, consider the finite Fourier transform (eq. (7.12)) over this 
interval 


i rT— Af/2 i roo 

- a L, n x(,) * - s L do(,)x(,) * 

(9.16) 

where do (0 is a boxcar function taking the value unity for — At/2 < 
t <T - At/ 2. By the convolution theorem (eq. (2.8)) and substituting 
equation (9.15), 

/ oo 

Dq(u — u/)X(u/)du/ 

-oo 

A OO 

= £ X{nAt)e-'“ nAt D 0 (u")e'“" nAt du” 

* n=— oo Ju-ue 


where Dq{uj) is the Fourier transform of do(£). The latter integral, 
which is to be evaluated only for |u/| < u i c , represents an ideal filter 
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Introduction to Time Series Analysis 
operating on the boxcar function, that is, 

ft 

I(n, N,uj)= / Do(u") e M nAt du" % do(nAt) 

J u/— Wc 

For example, figure 41 presents the real and imaginary parts of the 
integral for N = 64 and w = w c /2. The real part of the integral very 
closely approximates the boxcar function, while the imaginary part is 
negligible. The case shown is typical. The approximation is better as 
N — oo and |w| — * 0 and worse as N — * 0 and |u/| — u/ c . From this 
approximation, equation (9.16) becomes 


X T M*£fx(nAt)e-™* 

n*0 


(9.17) 


and it can be seen that the finite summation on the right-hand side 
approximately represents the finite Fourier transform of the signal X(t) 
over the interval (-At/2, T - At/2). 

It should be noted that because of the assumed stationarity of the 
random process X(t), the average properties of the random process in 
the interval (-At/2, T - At/2) are the same as those in any interval 
of length T. Thus, fundamental relation (9.17) will be used as the 
definition of the discrete finite Fourier transform in this monograph. 


9.6 Digital Blackman-Tukey Estimation 

Based on the understanding of digital data analysis developed in the 
previous sections, it is possible to develop discrete forms for spectral 
estimation. Suppose that N samples X(nAt) for n = 0, 1. 2, . . . , .V - 1 
from a stationary random process exist. Then, taking T = ;VAt and 
r = j At, the autocorrelation estimate analogous to equation (6.7) is 

iV-y-i 

Rx(jAt) = XI A-(nAi)*[(n+»Afl (9.18) 

T »»0 

for j = 0, 1, 2, . . . , TV — 1. This yields an unbiased estimate of the 
autocorrelation at these discrete lag values, j At. 

Likewise, if T m = m At is the width of the lag window, the power 
spectral density estimate is defined as the discrete approximation to 
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Figure 41. The integral /(n, N,w) for N = 64 and w = w c /2. 
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equation (7.7), 


SxM = ir ti(0)R x {0) +2y'u(jAt)R x U*t)cosujAt (9.19) 

L J 

by analysis similar to that which led to equation (9.17). 

If data on two jointly stationary random processes exist, similar 
expressions for the cross correlation and cross spectral density are 

JV-j-l 

RxyU&t) = T— £ X(»At)y[(n+j)At] (9.10) 

J n*0 

N-j-1 

flyxO'Af) = -i- £ y(nA«)X((n+i)A«] (9.21) 

i 3 n*0 

for j ' — 0, 1, 2, . . . , N — 1 and 

Sxy * U E uO‘At)AxyO'At) «-*-*** (9.22) 

m 

since the cross correlation is not generally an even function of r. 

9.7 Discrete Finite Fourier Transform Estimation 

A similar discrete version of the finite Fourier transform spectral 
estimate may be developed. If the block size is Tq = b At, then 
equation (7.16) may be estimated by 

a* l 

X F (w) = =- E d(nAt)X(nAt) e -twnAt (9.23) 

27r n^O 

from the discrete data X(nAt) for n = 0, 1, 2, . . . , b - 1. 

The spectral density estimate is then given by equation (7.15), 

SxM = WsI*fMI 2 (9.24) 

For jointly stationary random processes, the cross spectral density is 

5*yM = W s X m F (u)Y F (u) (9.25) 
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where Yp(uj) is given by equation (9.23) with X(nA£) replaced by 
V'(nAt). The window correction factor W$ can usually be determined 
analytically. 

9.8 Frequency Domain Window Insertion 

Many times, particularly when one wants to investigate the effects 
of various windows, it is more efficient to insert the windows in the 
frequency domain. Expressions by which this may be achieved are 
quite easily developed. Note first that, although the spectral estimates 
in equations (9.19), (9.22), (9.24), and (9.25) are given as continuous 
functions of frequency, the data are sampled, and thus these expressions 
should not be evaluated at frequencies higher than u c = tt/A£, the 
Nyquist frequency. Further, since the data have finite length, the 
resolution criterion (eq. (7.33)) suggests that the estimates should 
not be evaluated at frequencies that are too close together. The 
actual resolution depends on the window chosen, of course. However, 
equation (7.33) provides a reasonable guideline. 


Blackman-Tukey estimation . Now, recall from equation (7.2) that 
the windowed spectral estimate may be written as 

SxM = u(t)R x {t) e~ luJT dr 

By the convolution theorem, (eq. (2.8)), this estimate becomes 

S x (u) = f°° U(u-u/)$o(<J)dJ (9.26) 

J — OO 


where, since u(r) = 0 for |r| > T m , 

5°(w) = J R X (r) e~ tuJT dr 

is the spectral estimate (eq. (7.1)) with the boxcar lag window (i.e., 
effectively no window at all). Thus, equation (9.26) states that the 
spectral estimate with an arbitrary lag window is the convolution of 
the spectral estimate with a boxcar window. This allows one to first 
estimate the spectral density with a boxcar lag window and then insert 
various other windows in the frequency domain. 

The frequencies at which spectral estimates are evaluated is a matter 
of choice . The standard choice in Blackman-Tukey estimation is 


kit kw 
Uk ~T^ = m&t 


(A: = 0, 1,2, . . . , m) 


101 



V 

L 


1 

1^ 


l 


v 

L 


£ 


5 ' 


L ii. L L 


UllUlililili LlIhliUUli 



Introduction to Time Scries Analysis 

which yields as many estimates as lag points were used, reaches the 
Nyquist frequency when k = m, and provides a bandwidth of 

Aw = or A/ = -f- (9.27) 

a m m 

Although this choice does not yield full resolution by equation (7.33), it 
does have the virtue that it allows simple window insertion. Note that 
in this case, the discrete analog of equation (9.26) is 

m 

Sx(wfc) = Aw 21 o(w>) ( 9 - 28 ) 

j'=-m 


since So(wy) = 0 for \j\ > m. 
Now, from equation (7.10) 


T 

u((jj k - ujj) = ~ £ an sinc ((* ~i - n H 


l 

2 Aw 


oo 

23 a n< 5 fc.j+n 

n*— oo 


Thus, equation (9.28) becomes 

j fc+m 

■Sx(^fc) = 2 E a n-?o(u;fc_ n ) 
n=2e— m 


(9.29) 


which allows window insertion in the frequency domain. Necessary 
terms for w < 0 are obtained from So(w_fc) = So(w*). Equation (9.29) 
shows that windowing is equivalent to applying a low-pass moving- 
average filter (to be discussed in chapter XI) to the unwindowed 
spectral estimate. For example, to apply the Hanning window where 
oo = 1, ai = 1/2, and all other a n ’s are zero, 

1 * 1 . 1 . 

$x(wfc) = -•S , o(w Jt _ 1 ) + -So(w fc ) + jSo(w;fc +1 ) 


Finite Fourier transform estimation . A similar expression for 
frequency domain window insertion may be developed for the finite 
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Fourier transform. Recall equation (7.16): 

Xf[uj) = h /_* d{> t)x{t) dt 

Applying the convolution theorem (eq. (2.8)) to this equation results in 

A>(w)= / D{u-J)X T {J)du' (9.30) 

J — oo 

where X^(u;) is the finite Fourier transform (eq. (7.12)) of the data 
with block length Tq = 6 Ai, 


X T {u) = 



dt 


since X(t) = 0 for t < 0 and t > Tq. 

Standard practice in the finite Fourier transform estimate is to 
assume that b is even and to evaluate the estimate at the frequencies 


2A:7r 2kir 

UJk ~Y^~ FKi 


(* = 0 , 1 , 2 , ..., 6 / 2 ) 


(9.31) 


This choice yields half as many estimates as data points, reaches the 
Nyquist frequency when k = 6/2, and provides a bandwidth of 


or A/ = =5- (9.32) 

T B Tq 

Note that, for a similar data length, this is twice the bandwidth of 
the Blackman-Tukey estimate and provides full resolution by equa- 
tion (7.33). The frequency choice in equation (9.31) also allows highly 
efficient computations as will be seen in chapter X. 

At these frequencies, the discrete analog of equation (9.30) is 


6/2 

X F (u k ) = &u D(u k -Uj)X T {uj) (9.33) 

3 — t >/ 2 


Now, recalling that the data window is defined only for 0 < t < Tq, 
analogous to equation (7.10), 


D(u) 


Tge-i^TB/^ 

4 IT 


oo 

yi a n sine 

n=— OO 


Of 
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Thus, 


D(u k - Uj) = — ^ a " sinc [(* ~ J ~ n ) 7r l 


f— 1)*— > 22 . 

47T 

(-1)*- J 
2 Aui 


n=— oo 
00 


X] °n^fcj+n 


and equation (9.33) becomes 


. k+b/2 

X F (uJ k ) = - E (-D n anX r (u; fc _ n) (9.34) 

n»A:-6/2 

which again allows window insertion in the frequency domain. Nec- 
essary terms for u < 0 are obtained from Xt(— k) = Xj.(u/*). For 
example, if the Hamming window where clq = 1.08, ai = 0.46, and all 
other a n ’s are 0 is used as the data window, 


= — 0 . 23 X 7 * 4* 0.54X7’(cjjt) — 0.23X7 , (u/jt-f.i) 

Because of the simplicity of equations (9.29) and (9.34), the data 
are practically always left unaltered in the time domain and any desired 
window is inserted in the frequency domain. 

9.9 Autocorrelation Estimation Via Discrete Fourier 

Transformation 

Estimates from discrete data are not always directly analogous to 
those from continuous data. For example, the autocorrelation estimate 
in equation (7.25) is unbiased for continuous data. However, for a 
discrete set of data X(nAt) for n = 0, 1, 2, ... , 6— 1, the discrete Fourier 
transform (eq. (9.23)), assuming a boxcar data window and evaluation 
at the frequencies 


Uk = bAt (*-0, 1,2 6/2) 

is given by 

XfM = If E *(»**) t~ i2 * kn,b = Xr(uifc) (9.35) 

n=0 
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Then, the spectral estimate (eq.(7.19)) becomes 

SxM = £t t \ X T(u k )\ 2 (9.36) 

which is an even function of u; since >-k) — This can be 

employed to estimate the autocorrelation, based on equation (7.25), as 

_ A 6 / 2-1 

R x (jAt) = — £ £ \X T M\ 2 ^^ 

fc =- 6/2 
6 / 2 — 1 

= W R (&u ) 2 £ \X T (u k )\ 2 e i2 * kj/b 
*=- 6/2 

The upper limit of the sum is b/2 — 1 because only one-half of the values 
at each end of the spectrum is added due to the discontinuity at those 
points. 

Now, note from equation (9.35) that 

^r(w-fc) = ^r( w 6-fc) 


since 

c -»27r(6-fc)>/6 _ e -i2n(-k)j/b 
Thus, the estimate may be written 


6 — 1 

AxU&t) = W R (Au) 2 J2 \X T (u k )\ 2 e t2vk ^ b (9.37) 
fc=0 

for j = 0, 1,2 ... ,6 — 1. 

How good is this estimate? Its mean is 

6-1 

E {fi x (jAt)} = W R {Au) 2 £ E {|X r (u, fc )| 2 } ^ (9.38) 

*=0 

where, by equation (9.35), 

E{|X r (u, fc )| 2 } = 

V / r=0 (=0 
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Introduction to Time Series Analysis 
Thus, equation (9.38) becomes 


E [**0*1)1 > £ E KxW - DAI] £ * 

r=0 £*0 

Now, note that for all real or complex values 



(9.39) 


This series is of fundamental importance in the analysis of discrete data. 
Thus, 


fc* 0 


k 


1 - c *2w (j-r+l) 

1 _ e »2»0'- r +<)/6 = b + ^'.6+*— <) 


(9.40) 

The two delta terms arise when n — r — £ is some integer multiple 
of 6, which occurs only for j — r + £ = 0 and j — r + £ = 6 since 
-(6 - 1) < j - r + £ < 2(6 - 1). Thus, 


E {A x (jA*)} » {(6 - + ;**[(<>-» At]} 

Recall that a boxcar data window results in a Bartlett lag window. 
Therefore 

"'•-srarH)’ 1 

and 


E {A*(jAt)} = RxU&t) + - j)At] 

Therefore, the estimate is seen to be biased. 


9.10 Zero Insertion 

The bias in this estimate was caused by the presence of the second 
term in equation (9.40). This term may be eliminated by the technique 
of zero insertion. Suppose that the original b data points X(nAt) for 
n = 0, 1, 2, . . . , b — 1 are augmented by b zeros, that is, 

X(nAt)=0 (n = 6,6+1 26-1) 
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Chapter IX Digital Time Series Analysis 


The discrete Fourier transform of this entire sequence is then 

At 26-1 

XtM - ff £ X(nAt)e- i2 ^ 2b 

n=0 

= ^Z X ^ e ~ tWkn/b (9-41) 

n=0 

Note that this has the effect of reducing u/£ by a factor of 2, that is t 

UJk = bAt (* = 0,1, 2,... ,6) 

with Au = w/b&t. Thus, the spectral estimate (eq. (9.36)) when 
zeros are inserted has the equivalent bandwidth of a Blackman-Tukey 
estimate. 

Now, analogous to equation (9.37), 


2/>— 1 


R x (jAt) = 2W R (&u) 2 £ |Xr(«*)l 2 

Ar=0 


(9.42) 


is the expression used to estimate the autocorrelation. The mean of 
this estimate is 


£?{Ajc(jAi)} = 2W r (*u) 2 £ £{|X r (w fc )| 2 } 

k*e 0 

where now 

V / r=0 e=o 

Thus, 


E [fixO'AO] = ^EE R x[(‘ - r W E [e iw °'- r+fl/6 ] 


26—1 


r=0 (=0 


Jt=0 
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Introduction to Time Series Analysis 
However, 


fca 0 


1 _ e i2ir(j-r+*) 

1 - g»r(i-r+7)/6 “ 


since j — r + I never reaches 26. Thus, 


£{AxC/At)} = = flx(j'Af) 


using the same 6-point window correction factor as before and the 
estimate in equation (9.42) is unbiased. The same zero insertion 
technique should also be used for estimation of a cross correlation. 


9.11 Digital Spectral Estimation Procedure 

Now that all of the required relations have been developed, it is 
possible to develop a decision-making procedure that one should follow 
when attempting to estimate power spectral densities with digital data. 
This procedure should be thought through before one ever begins to take 
data . 

Step 1: Axe the data stationary? Since all the analysis developed 
thus far is based on this assumption, these techniques are not valid if 
the underlying random process is not stationary. Whether or not a 
particular time history is a sample function of a stationary random 
process is a decision that must be based primarily on engineering 
judgment. However, a test of stationarity is discussed in chapter VI. 
If the data are not stationary, it may be possible to detrend them and 
make them appear stationary without losing the information of interest. 
This technique is discussed in chapter XI. 

Step 2: What is the maximum frequency of interest /max? This 
defines the sampling rate since 


2 /max 


Step 8: Does the signal contain power at frequencies higher than 
/max? ^ so* the analog data must be passed through a low-pass 
antialiasing filter before it is digitized. 

Step What frequency resolution is required? For the finite Fourier 
transform technique, the required resolution sets the block length Tq, 
since 

A/ = ^ (Tg = 6 At) 
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Chapter IX Digital Time Senes Analysis 


or if zeros are inserted, 



For the Blackman- Tukey technique, the required resolution sets the 
width of the lag window T m , since 


A/ — ^ {T m — m At) 

Step 5: How much accuracy is required? This sets the total length 
of data to be taken since the degrees of freedom for the finite Fourier 
transform technique are 


k = 2N B 

where N B is the number of data blocks. For the Blackman- Tukey 
technique, 

2T = 2N 
T m m 

where N is the total number of data points. If it is not possible to 
obtain this many data points, then one must relax the requirements on 
either resolution or accuracy or both. 
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Chapter X 

The Fast Fourier Transform 


In the late 1960’s, the field of time series analysis was completely 
revolutionized by the introduction of highly efficient techniques for 
computing the discrete Fourier transform. The application of these Fast 
Fourier Transform , or FFT, techniques on modern digital computers 
was popularized by Cooley and Tukey 16 in their paper entitled “An 
Algorithm for the Machine Calculation of Complex Fourier Series.” 
With the use of the FFT, the finite Fourier transform approach to 
spectral estimation became so computationally efficient that it replaced 
the Blackman-Tukey approach for most practical applications. Even if 
one is interested in only the autocorrelation, it is often more efficient to 
first estimate the power spectral density and then transform to obtain 
the autocorrelation rather than to use the more direct lagged product 
approach (eq. (9.18))! 

There are actually many FFT algorithms in common use today. In 
fact, most are tailored to take advantage of the particular architecture 
of the computer on which they are implemented. However, all are 
concerned with evaluating the discrete Fourier transform (DFT) 


N-i 

Z k =J2 Z 3 W* (fc = 0, 1, 2, , iV — 1) (10.1) 

j=0 

where z 3 is, in general, a sequence of complex numbers and 


w = e~ t2n / N 


is an iVth root of unity. Note that except for the scale factor At/2ir, 
equation (10.1) is precisely the discrete Fourier transform of equa- 
tion (9.35) if the Zj s are taken to be real. Note also that because 
of the properties of W, equation (10.1) is periodic with period iV and 
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Introduction to Time Series Analysis 

may thus be considered to be defined for all k ; that is, Z^y = Z^. 
The inverse of equation (10.1) is 

1 N ~ l 

*i~TfH z k w - ik ( 10 . 2 ) 

Ar*0 

since substituting equation (10.1) in equation (10.2) results in 


JV-1AT-1 Ar-i iV-i 

jEE -jiL-E 

k=*Q t^O t^Q fc=0 

1 ,V - 1 

<«0 

upon use of equation (9.39). 

If a single multiply-add operation is taken as a measure of com- 
putational work, straightforward calculation of equation (10.1) would 
take IV 2 operations, since for each k, the N data points Zj would 
have to be multiplied by the appropriate complex exponential and then 
added to the sum, requiring N operations for each k. However, since 
k = 0, 1, 2, . . . , IV — 1, there are IV 2 operations for the complete calcula- 
tion. This discussion assumes, of course, that the complex exponentials 
W have been previously calculated. 

10.1 Theory of the Fast Fourier Transform 

The basic idea of the FFT goes back at least to 1903, 17 when Runge 
noticed that if the number of data points N is not a prime integer, 
the number of operations can be reduced by splitting the calculation 
up into parts. Consider the simplest case when N = AB. Then the 
data may be broken up into A subrecords of length B as shown in 
figure 42, where a = 0, 1, 2, . . . , A — 1 is the index of subrecords and 
6 = 0, 1,2, — 1 is the index within a subrecord.. Then the time 

index j in equation (10.1) may be written 

j = aB + b (10.3) 

which simply states that b is equal to j modulo B. Likewise, there will I 

be IV = AB values of the frequency index k. Suppose the frequency 
data points Zk are broken into B subrecords of length A. Then letting 
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Chapter X The Fast Fourier Transform 


H-t 1 — I— I 1 

012 B-1 BB+1 2B-1 

length B 

a=0 a =1 


AB=N 


a=A-1 


Figure 42. A subrecords of length B. 


c = 0, 1, 2, . . . , B— 1 be the index of subrecords and d = 0, 1, 2, . . . , A — 1 
be the index within a subrecord yields 


k = cA + d (10.4) 

Thus, equation (10.1) may be written 


ZcA + d = E E ZaB+bW {aB+b)(cA+d) (10.5) 

< 2=0 6=0 

where 

yy(aB+b)(cA+d) __ ^yacABy^adB^bcA^ybd 

However, 

yyacAB = yyacN _ i2irac _ ^ 

and thus this term need not enter the computation. Then, equation 
(10.5) may be written 


Twiddle 

factor 

ZcA+d = E ^ wM E z *B+bW adB (10.6) 

6=0 a =0 

1st transform 
2nd transform 

This expression has the character of a double Fourier transform. The 
first transform 

E ZaB+bW** 8 

a=0 

requires A operations and must be done for each b (B of them) 
and each d (A of them). Thus, A?B operations are required by 
the first transform. Each of these transforms ( AB of them) must 
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Introduction to Time Series Analysis 

then be multiplied by the appropriate exponential called the 

“twiddle factor” by Gentleman and Sande. 18 This requires another 
AB operations. Finally, the second transform requiring B operations 
must be accomplished for each c (B of them) and each d (.4 of them). 
Thus, AB 2 operations are required by the second transform. The total 
number of operations yv 0 p is 

iV op = A 2 B + AB + AB 2 = AB{A + 1 + B) = N{A + B + 1) 

compared with N 2 for the direct evaluation. Suppose, for example, 
that N = 100 = 10 x 10. Then the 10000 operations required by the 
direct transform would be reduced to 2100 by splitting the calculation 
into parts. Further, this reduction is due to only one reduction. If .4 
and/or B is not prime, the technique can be applied again. 

The ultimate reduction of course comes when the number of data 
points is some power of 2, say 2 P . Then the calculation can be 
reduced to 2 P - 1 transforms, each of length 2. To see this, note. that 
equation (10.1) may be written 


Z k =Ez^ 2 ^ /N 
i- o 

N/2—\ N/ 2-1 

= £ z 2>e - i2 »*( 2 >)/"+ £ z 2}+l e~ t2 ^ +l ^ N 

j* o y* o 

Even terms Odd terms 

N/ 2-1 N/2-1 

= £ z 2j e~ i2vk] ' /{N/2) +e~ i2irk/s Z 2 }+ie~ t2irkj/(N/2) 

J*0 T J«0 

^ ^ ^ Twiddle ■■■ ■ ^ 

Transform of factor Transform of 

N/2 points i.V/2 points 

(10.7) 

Thus, by separating the transform into even and odd terms, the 
calculation has been reduced to two transforms of N / 2 points. Ag ain 
note the appearance of the twiddle factor. Now, jV/2 = 2 P_1 , so that 
the procedure may be repeated until the ultimate reduction is achieved. 
For example, the transform of eight data points would be accomplished 
as shown in figu-e 43 where the dots indicate transforms of two data 
points. This scheme results in a total number of operations equal to 
N log 2 N rather than the N 2 for the direct approach. For example, with 
N = 128, the direct approach would require 16 384 operations while the 
FFT would require only 896. 
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Chapter X The Fast Fourier Transform 


Z 0 Z 4 Z 2 Z 6 Z 1 Z 5 ^2 *7 



The above explanation, found in Nussbaumer, 19 is one of the most 
easily understandable. For actual coding, binary notation is introduced 
for the time and frequency indices and the transforms are calculated 
recursively. To achieve the desired speed, the complex exponentials 
must be calculated efficiently as well. By using the fact that W T = 
WW 7 *” 1 , this calculation can also be done recursively, with evaluation 
of only W itself required from a series. Great care is also given to 
the storage of the intermediate results since the data are used in 
complicated orders as can be seen in figure 43. The k values of the 
transform also do not remain sequentially ordered. Thus, a massive 
schuffling of data is required. For this reason, there are many different 
versions of the FFT, which are very computer specific. Good references 
to these methods are Otnes and Enochson* 2 and Nussbaumer. In 
spite of the seeming complexity of these techniques, implementation is 
not overly arduous. For example, Gonzalez and Wintz 20 published a 
FORTRAN program for an FFT algorithm consisting of only 34 lines 
of code, including generation of the complex exponentials W\ 

10.2 Properties of the Discrete Fourier Transform for Real- 

Valued Data 

Although the idea of an FFT can be employed only if the number 
of data points N is not prime, the discrete Fourier transform (DFT) 
(eq. (10.1)) always exists. Further, the DFT is sometimes necessary for 
use in spectral estimation when there are constraints on the number of 
data points or greater freedom in the selection of frequency resolution is 
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Introduction to Time Series Analysis 

required. However, it should be emphasized that the DFT is generally 
computationally less efficient than the Blackman-Tukey technique. 

The DFT (of which the FFT is a subset) has some useful properties 
for real- valued data Zj = xy. In this case, note that equation (10.1) 
may be written 




j- 0 


(& = 0, 1 , 2, . . . , iV — 1) 


Now, 


Further, 


X-k » Y, = XI 

j=o 


Xs-k = Xj e- i2 ^~ k ^ N = 

j=0 jmt 0 

= ■€“•*/" = x; = a :.. 

i - o 

since = 1. Thus, the DFT is periodic with period /V since 

X N „ k = 

Further, only N/2 of the points need to be calculated since 


= x; 


(Recall that the Nyquist frequency occurs at k = N/2 if N is even.) 

For real data, the power of the DFT is not being fully utilized since 
only half of the transforms are of interest. However, suppose one has 
two real sequences Xy and y 3 to be transformed. Then one can make 
better use of the DFT by defining complex data 


Zj = Xj + iy 3 

Then 

Zk = £ (*;• + *»>) e" 2,rfc>//V = + IT* 

j=o 
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Chapter X The Fast Fourier Transform 


and 


;V— 1 


z;v-k = E (*i - *'*i> «"‘ 2,r * i/ ' v = ** - 

J=0 

Thus, the transforms of the real sequences can be recovered by 


= £ V i2 ' ww = 

J=0 


and 


1=0 


Zk ~ f^-fc 
2 i 
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Chapter XI 


Digital Filtering 


A filter is any physical device or mathematical operation which is 
applied to a time history in order to change it in some way . Filters are 
usually broadly classified as low pass , passing the low frequencies and 
attenuating the high frequencies, high pass, passing the high frequencies 
and attenuating the low frequencies, band pass , passing frequencies in 
some band while attenuating both the higher and the lower frequencies, 
and band reject, attenuating the frequencies in some band. Many times, 
it is of interest to perform such an operation after the time history is 
digitized. Thus, one is led to the subject of digital filters. 

11.1 Linear Filters 

The most prevalent filters are linear. Consider the ordinary linear, 
shift- invariant system shown in figure 44, where X(t) and y(£) are 
related by 

Y{t) = h(r)X(t — r) dr (11.1) 

J -oo 

By definition, h(t) is a filter since the time history X(£) is changed into 
the time history Y{t). If X(£) is band limited and is known only at 
discrete times nAt for — oo < n < oo, then by Shannon's sampling 
theorem (eq. (9.6)) and equation (5.3), equation (11.1) may be written 


oo 

Y(t) = — V X(nAt) 

uJ C * 

n»— oo 
oo 

= — Y' X(nAt) 
Wc 

n» — oo 



dwH{u) 



tf(w)e ,w ( | ~ nir / u ' c > du 


sine y e“* tu; v/ w e dy 


(11.2) 


Further, since X(£) is band limited and the system is linear, Y(t) is 
also band limited because of the convolution theorem (eq. (2.9)). Thus, 
evaluating equation (11.2) at t = k At results in 
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Introduction to Time Scries Analysis 


X(t) 



Y(t) 


Figure 44. Linear filter. 




r 

I . 


n cub 


00 00 

r(*Af)= h b [(k-n)At]X(nAt) = £ h b (jAt)X{(k - j)At\ 

n*— oo j'ss — oc 

(11.3) 

where 

MO = ^ J H(u)e Mt du 

Note that /i&(£) is not quite equal to h(t) since A(£), being a transient 
function, cannot be band limited. Thus, physically represents 

a filtered impulse response function. 

Based on equation (11.3), a linear digital filter is defined to obey the 
discrete convolution relation 


n* £ hjX^ (n.4) 

i m — oo 

where Y k = Y(kAt) and X*_ ; = X[(fc - j)Ai]. Equation (11.4) 
represents a simple or nonrecursive digital filter operating on the data 
X(nAt) and may be implemented by choosing a set of hj’s. Note that 
for causality, hj = 0 for j < 0. However, if the data have been stored on 
magnetic tape, one need not be constrained by causality. This freedom 
is a fundamental difference between analog and digital processing which 
can be exploited to one’s advantage at times. A filter of the type 
represented by equation (11.4), applied in the frequency domain, has 
already been seen in equation (9.29). 

The frequency response of the filter is determined from the frequency 
response function 

H( W)= f; h je -^ At (11.5) 

3 m - oo 

which can be seen to be a discrete approximation to equation (5.2) 
utilizing equation (9.15). 
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Chapter XI Digital Filtering 


H(co) 



Figure 45. Frequency response of three-point moving average. 

One of the most common and useful digital filters is called the 
moving average , which can be used for trend removal. For example, 
for a three- point moving average, one would take h 3 = 0 for |j| > 1 to 
yield the noncausal relation 

Y k = h. iX k+l + hoX k + /ux*.! (n.6) 

The frequency response of the filter, from equation (11.5), is 

H(u) = h- ie luiAt + ho + 

= ho + (Ai 4* /i_i) coawAt - i(h\ - /i-i) sin uAt 

For a low-pass filter, the normalization 


H(0) = Hq + h\ + /i — i = 1 

is conventionally applied to ensure that a dc (constant) signal will pass 
through the filter unchanged. Further, because digital filters operate 
only in the range (0, ir/A t) where rr/At is the Nyquist frequency, setting 

H(w/At) = /io-(Ai + /i-i) =0 

ensures that the frequency response will be low at high frequencies. 
Then, taking h\ = h_i to make the frequency response function real 
(i.e., the phase is zero) and solving these three relations yields 

*0 = \ hi m ft— I = 1 


Thus, 


H(u>) = -(1 +coswAl) 


which is a simple low-pass filter as shown in figure 45. 
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Moving-average (low-pass) filters are especially useful for removing 
trends from data. Suppose, for example, that one had a finite-length 
digital time history as shown (continuously) in figure 46. The data 
in this figure are decidely nonstationary, and if this time history were 
analyzed directly, it would have practically all of its power at very low 
frequencies. A time history like this might be produced by the price 
of a stock on a stock exchange (where the values X(t) would all be 
positive) or the surface temperature of a spacecraft undergoing reentry, 
for example. 

Suppose one were interested in the high-frequency components 
rather than the long-term trend in the data. If so, the data could 
be passed through a moving-average filter, that is, 

Yk-\Xk+i + \x k + \x k _, 

and Y(t), again shown continuously, would appear as shown in figure 47. 
Actually, to achieve this much smoothing, the data might need to be 
passed through the filter several times in sequence. Such a series of filter 
operations is equivalent to applying a higher order (i.e., more nonzero 
h's) filter once. For example, two applications of the operation in 
equation (11.6) would produce a frequency response function // 2 (u/) = 
l U (1 + cosu;At) 2 = Vs + Va cos wAt 4- Vs cos2uAt, which might also 
be produced by one application of a five-point moving average. 

After sufficient smoothing has been achieved, the Y k s could be 
subtracted from the X k s to yield 

Zk = X k - Y k 

where Z(t) would appear much more stationary, as shown in figure 48. 
In this way, a much better estimate of the power at the higher 
frequencies could be obtained. Note that this moving-average operation 
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Chapter XI Digital Filtering 


Y(t) 



Figure 47. Long-term trend in time history. 
Z(t) 



Figure 48. Detrended time history. 


Y(t) 


Figure 49. Schematic of recursive filter. 

is exactly equivalent to passing the input X(i) through a high-pass 
moving-average filter with weights h$ — l h and /i_i = h\ =■ — X U. 

11.2 Recursive Filters 

A much more computationally efficient filter is one where the output 
is fed back into the input, as shown in figure 49. 

In general, the delay (or shift if the independent variable is not 
time) may be considered a bank of delays where the output is delayed 
by various delay times before being fed back into the input. Such a filter 
may, of course, be unstable (i.e., unbounded output) if the feedback is 
too large, just as a microphone-amplifier-speaker system will begin to 
screech if the gain is too high. Mathematically, it can be shown by an 
argument in the complex plane that such a filter is stable (i.e., bounded 
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Introduction to Time Series Analysis 

output) if and only if all the roots of the denominator of its transfer 
function have positive imaginary parts . 

First order recursive filter . Let 

Y k = aY k - X +0X k (11.7) 

where a and 0 are real. This is called a first order recursive filter 
because it uses only a single delay of time At. Equation (11.7) is an 
example of a difference equation and may be solved by defining the 
generating function 

Y(z)= f; Y k z k (11.8) 

O O 

where z is complex. If equation (11.7) is multiplied by z k and summed 
over all k, the equation 

Y{z) = azY{z) + 0X{z) 

results where X(z) is defined similarly to Y(z) in equation (11.8). Thus. 

Y(z) = y^Tz X{2) = 0[1 + aZ + (a2)2 + ' ’ ' ]X{Z) (U ' 9) 

assuming that \az\ < 1. Thus, equating coefficients yields the solution 
of equation (11.7), 

n = /? fVXfc.y (11.10) 

J-0 

which can be seen to be bounded for arbitrary bounded input if and 
only if |q| < 1, since otherwise the magnitude of the coefficients would 
increase with j. 

Consider now the transfer function for this recursive filter. Suppose 
one takes z = which satisfies \az\ < 1 for a bounded output 

filter. Then, equation (11.8) becomes 

Y ( :«)■ f; Y k e 

fc *— 00 

which except for the scale factor is the discrete Fourier transform 
(eq. (9.15)) of the band limited process Y(t), Further, equation (11.9) 
becomes 

Y{u) = H{ u)X{u) 
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Chapter XI Digital Filtering 


where 

H{u) = r— (11.11) 

is the transfer function of the first order recursive filter. The root of 
the denominator of this transfer function is given by = 1/a or 


UJ 


^ln(i) (a > 0) 

+ ^J ln (~a) ( a < °) 


and it can be seen that the condition |a| < 1 obtained for the 
boundedness of the equation (11.10) is precisely that required to cause 
the root of the denominator of the transfer function to have a positive 
imaginary part as required by the stability criterion. This condition 
also is intuitive since if it were not true, the output would be amplified 
before being fed back. 

Returning to equation (11.11), the squared magnitude of the transfer 
function is 


l#MI 2 = 


1 + q 2 - 2q coswAi 


( 11 . 12 ) 


and it can be seen that 


|tf( 0)| 2 = 


p 

(1-a) 2 


and 

Thus, for 0 < a < 1, the filter is low pass as shown in figure 50, and 
for —1 < a < 0, the filter is high pass as shown in figure 51. In these 
figures, the values of 3 have been chosen to make the gain unity at the 
respective passband ends. 


Second order recursive filter . More freedom in tailoring the shape 
of the transfer function may be obtained with a second order recursive 
filter: 

Y k = a x Y k ^ l + 02^-2 + (11.13) 

A similar analysis yields 


Y M = 


3 

1 - aie- lu ^ £ - a 2 e” lw2 ^ £ 


xm 
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|h(co)| -k a <0 



and the transfer function is seen to be 


H( w)- 


1 - aie-'"* 1 - a2«'" lw2A£ 
For stability, the roots of 


(11.14) 


1 - a 1 e“ ,u,A£ - a 2 <r lu;2 ^ = 0 

must be examined. For convenience, defining s = e lw ^ £ = l/z and 
multiplying by s 2 yields 


s 


2 


— ai3 — c*2 


0 


with roots 

ai ± \far + 4a — 2 

*1.2* 2 (1115) 
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Figure 52. Stability diagram for second order recursive filter. 
Therefore, the roots in terms of u> are 


^ 1,2 


— * 
At 


In 31,2 = 


— I 

At 


^ 1.2 

At 



(ln|^i t 2l + idix) 


where d\£ are the polar angles of the roots. Thus, an equivalent 
condition for stability is 

|*ul < 1 (11-16) 

Now, if a\ + 4<*2 < 0 in equation (11.15), then 3i = 3% and 
kl,2l = (-^2) i/2 


Thus, (-a 2 ) 1 ^ 2 < 1, which implies that a 2 > —1. However, if 
aj + 4 c*2 > 0, then the roots are real and unequal. The stability 
conditions on these roots lead to Q 2 < 1 - c*i and 1 + cti > <*2 and 
therefore to the stability diagram shown in figure 52. In summary, if 
the parameters a\ and 02 are chosen in the triangular region, the filter 
is stable. 

Returning to equation (11.14), the squared magnitude of the transfer 
function is given by 


l^( w )l (1 + 4* Q 2 ) - 2a i(l - 02 ) cos uAt - 2q 2 cos 2 uAt 


(11.17) 

By choosing various values of Qi, 02 , and /?, one can tailor this transfer 
function to have desired characteristics. For example, if <*2 = 0, then 
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Ih(co)I 



|H(CO)l 



the second order recursive filter reduces to the first order recursive filter 
studied in the previous section. Thus, a\ > 0 produces a low-pass filter 
and ai < 0 produces a high-pass filter. However, if ai = 0, then 

|tf(o)l 2 = (1-1 2 )S = \ H (7r/ ^ )|2 

while 

Thus, for a 2 < 0, the filter is band pass as shown in figure 53, while for 
c *2 > 0, the filter is band reject as shown in figure 54. In these figures, 
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the values of 3 have again been chosen to make the gain unity in the 
regions of interest. 

By taking both ai and Q 2 to be nonzero within the stable region of 
figure 52, various weighted combinations of high-pass, low-pass, band- 
pass, and band-reject filters may be obtained. More sophisticated filters 
with even more freedom in their shape may, of course, be developed. A 
good reference is Otnes and Enockson. 12 
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Chapter XII 
Special Topics 


The field of time series analysis is vast and rapidly changing. In an 
attempt to provide both complete and current coverage, this chapter 
presents a potpourri of specialized topics and areas where research is 
currently in progress. 

12.1 The Kendall Series — A Test Case 

In the development of computer codes to implement the various 
estimation techniques discussed in this monograph, it is extremely 
useful to have a benchmark time history for which the various statistical 
moments are known theoretically. One particularly simple way of 
generating such a time history is to use the Kendall series. 

The Kendall series Y n is generated by the recursive relation 

y n = axYn-x + a 2 Y n - 2 + (n « 0, 1 , 2 ,.. .) (12.1) 


where X n is a sequence to be specified later. This series has been 
studied by Bartlett 11 and can be seen to be a second order recursive 
filter (eq. (11.13)) operating on the X n ’s. Equation (12.1) may be solved 
by a technique similar to that used in solving equation (11.7) to yield 




3l - 3 2 


3 1 - *2 




ill 


Jfc»0 


3 1 * *2 


n — k 


where Y-i and V _ 2 are the initial conditions and the s’ s are the roots 


* 1,2 


Q\ ± yja I + 4Q2 

2 


which are assumed distinct; that is, + 4 a 2 7 * 0. The roots |si, 2 l are 
less than unity for stability; thus, as n — ► 00 , the solution approaches 


m iJ2L H ITENTiQNALLY BLANK 
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Introduction to Time Series Analysis 
the steady-state solution 


E 


(.?+'- air 1 ) 

31 - 32 


Xfl—k 


( 12 . 2 ) 


Now, suppose that the terms in the input sequence X n are indepen- 
dent and identically distributed random variables such that 


E{X n } = 0 


and 

E {-Xn-Xn+m } = 0 

That is, the input random variables simulate a white noise random 
process. Then, when steady state has been reached, the mean value of 
K n is 

E{Y n }= 0 

and its autocorrelation is 


Ry(r) = E{V n Vn+r} = As[ + Bs \ 


where 


and 


A = 


B = 




{dl -S2)(l -3\32)(l -3i) 




(ai -s 2 )( 1 -3i3 2 )(l -s\) 

Thus, Y n is a weakly stationary random process. Further, 

fO (r > 0) 

Ryx(r) = E {Y n X n+r } = | 4(,{^- f ) (j . < Q) 

and X n and Y n are jointly weakly stationry. 

In this case, both the power spectral density 

SyM = ^ L Ry(r)e~™ 

r»-oo 

= — ( 1 + 1 - A 

2tT \ 1 - 3ie -tu/ + 1 — 316*“ 7 

B_ { 1 + 1 _ \ 

+ 2 tt V l - s 2 c- ,w + 1 - s 2 e lw / 
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(taking At = 1) and cross power spectral density 


S Y x(u) = H Ryx( r ) e 


JL. 


2tt [l - ( 3 1 + 32)e luJ + 5i 32« j2w . 


( 12 . 6 ) 


exist for \w\ < 7r. It might be noted that these relations are independent 
of the distribution function for the random variables X n . 

Most computer systems have standard software that generates white 
noise data with mean zero, any desired variance cr\ y and many different 
distribution functions. Thus, one may choose values of ai and a 2 (such 
that the filter is stable according to fig. 52), use the initial conditions 
Y-\ = YL 2 = 0 (thus starting off with the steady-state solution), and 
recursively compute equation (12.1) to yield a set of digital random 
data of any desired length. Any program to analyze such data can then 
be exercised and the results compared with the theoretical expressions 
(12.3), (12.4), (12.5), and (12.6). 

For example, figure 55 shows a comparison of the theoretical expres- 
sion (12.3) for the autocorrelation with estimates of the autocorrelation 
obtained from the classical lag product approach (eq. (9. 18)). 21 The 
data used in the estimate shown in figure 55(a) were generated by the 
series in equation (12.1) with a\ = 1.1, 02 = —0.5, a ^ = 0.333, and a 
uniform distribution of the random variables X n . For figure 55(b), the 
conditions were the same except that the random variables were nor- 
mally distributed. Note that the autocorrelation estimates are nearly 
the same in both cases and agree well with the theoretical expression 
except near the lag value of 22, that is r = 22 At. This difference is 
apparently due to spurious correlation induced by the white noise gen- 
erator. Such spurious correlation is often seen as it is very difficult to 
generate truly uncorrelated random variables. 22 

Figure 56 shows a similar comparison of power spectral density 
estimates with the theoretical expression (12.5). These estimates 
were obtained by transforming the autocorrelation estimates shown 
in figure 55 according to the standard Blackman-Tukey technique 
(eq. (9.19)) with a Hanning lag window. The number of degrees 
of freedom was 1000 for this estimation technique. Note the good 
agreement between the theoretical expression and the estimates in both 
cases. 

In figure 57, estimates of the magnitude and phase of the cross 
power spectral density are compared with the theoretical expression 
(12.6). The data utilized in this study 23 were generated with the 
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T . *•<= t . sec 

(a) Uniform diatribution. (b) Normal distribution. 

Figure 55. Comparison of theoretical and estimated autocorrelations. 


o Theoretical 
u Kendall series 


c Theoretical 
A Kendall series 



'• Hz f. Hz 

(a) Uniform distribution. (b) Normal distribution. 


Figure 56. Comparison of theoretical and estimated power spectral densities. 


134 


L l 1 I. 1 


1 w 

l 

k 




\ 

L L L L L L' L L L' li L' U L' l 



Chapter XII Special 



f. Hz 

(a) Cross spectral magnitude. 



(b) Cross spectral phase. 

Figure 57. Comparison of theoretical and estimated cross spectral densi 
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Figure 58. Comparison of theoretical and estimated cross correlations. 

same parameters as before and normally distributed random variables. 
However, the cross spectral density was estimated directly using the 
finite Fourier transform technique (eq. (9.25)) with a Hanning data 
window. This estimation technique had only 64 degrees of freedom. 
Note that the cross spectral magnitude generally follows the trend of 
the theoretical expression. However, there is much random variability 
because of the small number of degrees of freedom. The cross spectral 
phase estimates, however, agree exactly with the theoretical expression. 

The cross correlation of these data was estimated by transforming 
the cross spectral estimate shown in figure 57 using the finite Fourier 
transform technique (eq. (9.42)) adapted for a cross correlation. This 
estimate is compared in figure 58 with the theoretical expression ( 12.4). 
Note that reasonably good agreement is achieved although there is again 
a deviation near a lag of 22. 

12.2 AR, MA, and ARMA Models 

Results like that of the previous section, in which the passage 
of a whito noise signal through a second order recursive filter leads 
to an output random process whose statistical characteristics can be 
determined analytically, have stimulated much interest in attempts to 
model arbitrary random processes by the passage of white noise through 
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various types of filters. Such modeling is the topic of considerable 
current research. 

The Kendall series considered in the previous section is a simple 
example of an autoregressive (AR) model of a random process, that is, 

Yn = a l^n — 1 + G 2 ^n — 2 + ' * * + &pYn—p + Xn ( 12 . 7 ) 

where X n is a white noise process. Equation (12.7) is called an 
autoregressive model of order p. Similarly, a moving average (MA) 
model is defined by 


Y n = boX n + b\X n -i 4* b 2X n -2 H + b q X n -q 


where q is the order of the model. These models are special cases of 
the more general relation 


Yn — a\Y n — { 4* a 2 r n - 2 + * * + QpYn—p 

4- b(^X n 4* b\X n ~i H 4- bqX n -q (12.8) 


which is called an autoregressive-moving-average (ARMA) model of 
order (p,q). 

When a random process Y (t) may be represented by such a model, it 
admits solution just as did the Kendall series and thus its moments are 
well understood. For example, multiplying equation (12.8) by e * um ^ £ 
and summing yields 


Y{u) = 


6q 4- b\z 4- byz 2 4 f b q z q 

1 — a\z — — • • * — OpzP 


X{u) = H{u)X(u) 


where z = and K(u;) and X(u/) are defined by equation (11.8). 

Thus, X(t) is the output of a linear filter with frequency response 
function H(u). 

Assume that H(u) is a stable filter and note that the input X(£) is 
a weakly stationary random process with 


E{X n } =0 


and 

Then, as n 


E {X n X n + m } — n,0 

oo, Y{t) is a weakly stationary random process with 


E{Y{t)}m 0 
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and 

s Y (u) = |tfM| 2 s*M = Ml 2 

Thus, such models can represent any mean zero random process whose 
power spectral density can be expressed as 


Sy'(w) = 




i >0 -l- b\z + b^z 2 + ■ 

• + b q z* 

1 — Q\Z — <22 Z 2 ~ * * 

■ — OpZP 


2 


in which p + q 4- 2 arbitrary constants are available to match the power 
spectral density. 

This result has led to much interest in representing real data 
sequences in terms of such models. Many different techniques have been 
developed for choosing an appropriate set of a’s and b’s. For example, 
for an AR model, multiplying equation (12.7) by Y n - r for some integer 
r greater than or equal to 0, taking expectation, and using the fact that 
#y( -r ) = rty(r) yields 

fly(r) = oif?y(r-l) + a2RK( r— 2) H \-OpRy (r— p) -l-cr^-<5 r ,o (12.9) 

since E{X n Y n } = E{X} t } — a This recursive equation may be used 
to extrapolate the autocorrelation from known values for an AR process. 
However, for the purpose of estimating the coefficients, note that if 
equation (12.9) is evaluated for r = 1, 2, . . . , p, the matrix equation 


■ Ry( 0) fly(l) • 

Ry(l) Ry(0) 

f 1 
atez 


’ar 

a 2 


r«y(Dl 

Ry(2) 

_Ry(p-l) Ry(p- 2) . 

■■ Ry( 0) . 


* a P- 


Ry(p). 


results. Thus, if estimates Ry (jAt) for j * 0, 1, 2, ... ,p of the autocor- 
relation of a random process are obtained from equation (9.18), time 
may be considered to be nondimensionalized by At and appropriate co- 
efficients for representing that process by an AR model can be obtained 
by solving the matrix equation. 

113 Data Adaptive Spectral Estimation Techniques 

Spectral estimation techniques such as the Blackman- Tukey or 
finite Fourier transform are said to be nonadaptive in the sense that 
the characteristics of these techniques are the same for all sets of 
data. That is, the algorithms for their implementation treat all 
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sets of data in exactly the same way. Recently, two new spectral 
estimation techniques, the maximum entropy method (MEM) and 
maximum likelihood method (MLM) have been developed. These 
methods are said to be adaptive since their design is data dependent. 
The characteristics of these techniques adapt to the particular data 
being analyzed. Such techniques allow mucA higher resolution than the 
nonadaptive techniques and are particularly useful when resolution is 
limited by short data lengths. 


Maximum entropy method (MEM). This technique was introduced 
by Burg. 24 The basic idea is to choose as the spectral estimate the 
power spectral density of the most random (i.e. t maximum entropy ) 
time series whose autocorrelation agrees with the known values . It can 
be shown that this amounts to extrapolating the autocorrelation to 
larger lag values than can be estimated from data of length T as shown 
in figure 59. The extrapolation is done in such a way that the entropy is 
maximized; that is, as little information as possible is added. Since no 
lag window exists in this case, the resolution is theoretically unbounded 
and the estimate is unbiased . 

Suppose that discrete equally spaced data X(nA£) for n = 
1 , 2 ,..., JV from a stationary random process X(£) exists. Shannon 25 
has defined the entropy of the random process as 

H n = ~ J ^ fx(x) In [c 2 ^/x(x)] dx (12.10) 


where /x(x) = /x(xi, 12 , . . . ,xjv; At, 2 At, . . . , iVAt) is the iVth order 
density function of the random process X (t) and c is am arbitrary 
constant. 
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Now, suppose that X[t) is a normal random process with mean zero 
(see chapter III). Then 


.V y 


/X(x) = 


(2jr) iV / 2 |A| 1 / 2 


exp 


1 

“ouT ^ ^ |A| mnX m X n 
' 1 m=l 7i=l 


where X m = X(mAt) and |A| is the determinant and |A| mn is the mnth 


cofactor of the matrix of correlations, 



R x(0) 

Rxi&t) 

.. R x [(N-l)At\] 

A = 

Rx(At) 

R x(0) 

. R x [(N-2)At] 


Rx[(N-l)At) 

R x [(N-2)At] .. 

Rx( 0) . 


In this case, for proper choice of the constant c, 




As N — > oo, H lW — oo. However, it is possible to define the entropy 
rate to be 


h = 


lim 

N~»oq 


Hn 

N 


lim i In |A| 1/JV 
iV— oo 2 


Further, since the autocorrelation depends on the power spectral den- 
sity, it can be shown that 


Jifoo |A|V " ■ f exp { S' JZ, N*** Ml <(*} 

where u/ e = ir/At is the Nyquist frequency and it has been assumed 
that Sx( u/) = 0 for |cj| > lj c . Thus, 

1 / (jJr \ 1 

h = 2 * n (~) + 4^ L du 


Recall that 


Sx(u) = U £ R x(mAt) 

m=-oo 
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and thus 


/i = 



In 


m=— oo 


du 


( 12 . 11 ) 


From the data, estimates of the autocorrelation RximAt) for m = 
0, 1, 2, . . . , N — 1 can be obtained. The desire is to extrapolate these 
values to obtain other estimates for m > N in such a way 

that the entropy rate (eq. (12.11)) is maximized. Thus 


dh 

dRx(mAt) 


= 0 {m>N) 


or 


/ (jJc 
-We 


iumAt 


-duJ — 0 


Wc E R X {mAt)e~ iuTnAt 

•=-O0 


( 12 . 12 ) 


Thus, if the new unbiased estimate of the spectral density is defined 


as 


Sx(“) — £ Rx{™&*) e 


—lujm&t 


2tt 


in terms of the estimated and extrapolated autocorrelation values, then 
equation (12.12) may be solved to yield 


where 


and the /i’s are 


5x(w) = 


/V— 1 

E a m R x {mAt) 

m=0 


2 Ur 


N 

1+ E 

m= l 


am ~ { -/ 

given by the solution of the matrix equation 


(m = 0) 

-h m (m = 1,2,..., N) 


(12.13) 



[h x l 


■ Rx( 0) ' 


>12 


W) 

A 

: 

— 




.R x [(N-l)At\. 
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X(t) 



x 0 (t) 


Figure 60 . MLM filter. 

Maximum likelihood method (MLM). The maximum likelihood 
method was introduced by Capon. 26 The basic idea is to develop a 
minimum variance unbiased estimator of the spectral components of the 
time history by designing a filter at each particular frequency that passes 
that frequency undistorted and rejects all other frequencies in an optimal 
manner. 

Consider a record of length T of a stationary random process X(0- 
It is desired to estimate the power spectral density of this random 
process at a particular frequency uq. This can be done by designing an 
optimal causal filter for the particular frequency as shown in figure 60. 
Note that the filter impulse response need not be restricted to be real. In 
this case, the output Xo(0 can also be complex. The spectral estimate 
is then the power output by this filter: 

S X (a;o) = l* 0 ( 0 l 2 

where if the impulse response has duration TV, 


fTr 

X 0 (t)= / h^(a)X{t-a)da (12.14) 

Jo 

is the time history passed by the filter. 

In order for the filter to be optimal, it should have unit gain at the 
frequency of interest, that is, 

H u,o(^o) = [ Tr h^ 0 (t) e -^dt = 1 
Jo 

Further, in order to reject other frequencies in an optimal manner, the 
filter should minimize the output power when the input process has an 
autocorrelation that agrees with the known data over the range (— T, T). 
By equation (12.14), the average output power of the filter is 


E {l-Xo(*)l 2 } = f d<*i f ^2^wo( a i)*w 0 ( a 2)^x( Q: 2-Q:i) 
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Minimizing this expression subject to the constraint that — 1 

yields 

WnKota) = ^{|X 0 («)| 2 }Q(ri,r 2 ) (12.15) 

where 

[ T Q(t,T)R x (T-t') dr = S(t-t') 

JO 

Now, multiply equation (12.15) by e - ‘ w o(n-'s) and integrate over r x 
and T2 to obtain 

E {|X 0 (t)| 2 } £" dr x jf dvi Q(n, n)e-^ r ^ 

= J q dr x /iwo(ri) e _,w ° ri d-^ A^,fo) e IW ° TJ = 1 
Thus, since the spectral estimate is taken to be the output power, 

dT X J^ dT >2 Q(n,T2) (12.16) 

where Q(t, r) may be estimated from the autocorrelation estimate by 
[ Q(t,r)Rx(T—t') dr = 6(t—t') 

Jo 

for t > 0 and tf <T. For discrete data, this can again be written as a 
matrix equation. 

Although these techniques provide higher resolution, they do so at 
the cost of increased computational effort. Basically, an additional 
matrix equation must be solved. Other techniques could, of course, 
be developed. Recall that 

Rx(r) = H SxM^du 

J —OO 

The data Rx{nAt) for n = 0, 1,2, — 1 provide a set of N 
constraints 

R x (nAt) = f°° S*(w) « <wnAt dw 

J— OO 

on the possible form of the power spectral density. Many methods for 
estimating Sx{u) within these constraints might be devised. 
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12.4 Spectral Analysis of Randomly Sampled Signals 

All the digital analysis that has been covered so far in this mono- 
graph has assumed that the time history is sampled at equal intervals 
At. However, in recent years, certain applications have stimulated inter- 
est in data sampled at random intervals. For example, the laser Doppler 
anemometry technique, which is used to measure velocity components 
in flows, amounts to setting up a control volume within the flow by 
crossing laser beams from different angles. The flow is then seeded 
with some particulates and a velocity measurement is obtained when- 
ever a particle happens to pass through the control volume. Spectral 
estimates obtained from such data are comparatively free from aliasing 
but have higher variability than corresponding estimates from equally 
spaced data . This is because the unequally spaced sample times elimi- 
nate the ambiguity associated with equally spaced samples that leads 
to aliasing, while the uncertainty of the random sample times leads to 
further uncertainty in the estimate. 

In both methods to be developed, the sampling times are assumed 
to be Poisson distributed 5 such that 


P{Sample in time interval (£, t + At)} = A At + o( At) 


where A is the average rate at which samples occur and o(At) indicates 
a term that approaches zero faster than At does as At — * 0. In the 
absence of any better information, this model is preferred in the sense 
that it fits many real world phenomena where events occur randomly 
in time. 


Method L This method was developed by Gaster and Roberts 27 in 
1975. Consider a stationary random process X(t) that is sampled at 
Poisson distributed random times t t for z = 1,2,... and define 


C(») = £{*(*, )X(t i+n )} 


Note that this is similar to an autocorrelation, being the expected 
value of the product of a sample with another sample n samples later. 
However, here the expectation is not only over the ensemble comprising 
the random process X{t) but also over the random sampling times t x . 
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From the concept of conditional probability, 5 


C{ n ) — [ £* {X(£t)^Y(t t -^ n )l^i+n ~ r } ^{^i+n “ T } dr 

Jo 

= r Rx(r)Pn(r) dr (12.17) 

Jo 


where fi'd) is the conditional expectation and p n (r) is the probability 
density function for the time interval between the zth and (i + n)th 
samples 

\n_n— l g -Ar 

Pn(r) = —7 TTf— (r>0;n>l) (12.18) 

(n - 1)! 

which is called the gamma density function. 5 
Recall that 

Rx(r) « r S x (u)e Mr du 

J — OO 

Then equation (12.17) may be written 

C{n) = [ Sx{u)<t> n {u)duj (12.19) 

J — oo 

where 

Thus, equation (12.19) becomes the integral equation 

C{n) = J Sx{u) (^ x^iu ) ^ (12.20) 

for the power spectral density Sx{u). 

Integral equation (12.20) has been solved for Sx(<+>) by Shapiro and 
Silverman. 28 The unique solution is 

s oo 


L L 


Sx(u) = r H b(n)i> n (u) 


(12.21) 
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where 

*•<“> “ R ' [-^ (iu — A) n 

«») = i/f x; l (-2)*( n ' 1 )<r(t+i) 

k=0 

and 

\kJ fc!(n — k)\ 

is the binomial coefficient. 

In practice, with N data points X{t x ) for i = 1,2 , ...,N from a 
single sample function, the function C(n) is estimated by 

, /V— n 

<?(") = E X(ti)X(t i+n ) (12.22) 

1*1 


which can be seen to be an unbiased estimate. This estimate is then 
used in equation (12.21) to yield the spectral estimate Sjv(u;). 

Method 2. In a later analysis (1977), Gaster and Roberts 29 
estimated the spectral density more directly. Recall the discrete Fourier 
transform spectral estimate (eq. (7.15)): 

(12.23) 

where 

' WXWe-^'dt (12.24) 

Now, if N randomly sampled data points X[t x ) for i — 1, 2, . . . , N oc- 
cur in the interval (0, T), an approximation to integral equation (12.24) 
may be obtained from 

1 * 

^ At, (12.25) 
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where A t{ = t, - t,_i and to = 0. Further, if samples occur at the 
average sampling rate A, then 


At, = 


l 

A 


and equation (12.25) may be further approximated by 

mxw-** i 12 - 26 * 

1=1 

This expression could be used in equation (12.23) to provide the spectral 
estimate. However, upon doing so, Gaster and Roberts found this 
estimate to be biased by a false constant shift, which they removed 
by the use of the estimate 


= 


Ws 

47r^A 2 


N 




-iwt t 


i=l 


£y(t,)x 2 (t,) 


i=i 


(12.27) 


Although this technique is computationally more efficient than 
method 1, more data are necessary to achieve the same level of 
accuracy. 


12*5 Cepstrum Analysis 

In the past few years, the use of cepstrum analysis 30 has come into 
prominence. The name is derived by inverting the first four letters in 
spectrum. This type of analysis is particularly useful for time histories 
involving a signal that is delayed and then added to itself, such as an 
echo, or for noise transmission by different paths, such as airborne and 
structure-borne sound. If X(t) is a stationary random process, then its 
power cepstrum is defined by 


Cp(r) = f ln[SxM]c lwT dw 

J — oo 


(12.28) 


which is the Fourier integral transform of the natural logarithm of the 
power spectral density. Since the power spectral density is reed and an 
even function of frequency, the power cepstrum can be seen to be real 
and an even function of the variable r. The reason for the use of the 
logarithm here is that any product term in the power spectral density 
appears as a summation in the cepstrum. 

The definition in equation (12.28) assumes that the power spectral 
density is never zero. When working with digital data, equation ( 12.28) 
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is approximated by the integral from — u/ c to u Cy where u; c is the Nyquist 
frequency. For real world random data, the occurrence of an identically 
zero value in this range is unlikely. 

Although the idea of inverting the first four letters in spectrum 
may have been a good one, this idea has been carried too far in 
cepstrum analysis. For example, the variable r in equation (12.28) 
is called “quefrency,” paraphrasing frequency, which is unfortunate 
since r is a timelike variable. Clearly, if it were not for the logarithm, 
equation (12.28) would be just the autocorrelation and r would be the 
lag time. Other examples of this type of paraphrastic excess will be 
noted. 

It is also possible to define the complex cepstrum to be 


C c (r) = f°° \n[X{u)\ e^duj 
J -OO 

where is the Fourier integral transform of X(t): 
= ^J°° XWe-^dt = \X{u)\e t,b ^ 
Note that, for X{t) real, 


(12.29) 


(12.30) 


X(-v) = X m (oj) 


and thus 
and 


l*(u/)| = \X(-u) 


<j>(uj) = - ) 

Thus, from equation (12.30) 

/ OO 

[In \X{u)\ + z<£(u;)](cosu/t + i sinu;r) duj 

-oo 

/ OO 

[In |X(w)| cos ujt — <j>(u) sin wr] duj 

-OO 

/ OO 

[In |-Y(u>)| sin ut + cosu/r] du 

-OO 


+ t 


The second integral vanishes since it is an odd function. Thus, the 
complex cepstrum is real! However, it is not an even function of r. 
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Source 



Figure 61. Sound source above flat surface. 


A third type of cepstrum, the real cepstrum , 


CV(r)= f°° \*\X(u)\ 2 j UT du (12.31) 

J — oo 

is both real and an even function of r and may prove to be even more 
convenient for transient functions. 

As an example of the use of this type of analysis, consider the 
geometry shown in figure 61. An acoustic source is producing a 
stationary acoustic signal S(t), which is being received by a microphone. 
Sound can reach the microphone by the direct path of length l\ or by 
echoing off the surface, resulting in a path length of £ 2 +^ 3 * The pressure 
signal recorded by the microphone is thus 


p(t) = 


S(t-e x /c) S[t-(i 2 + ( 3 )/4 
tl h + h 


(12.32) 


where c is the speed of sound and a is the fraction of the incoming 
energy that is reflected by the surface. If the reflecting surface was not 
present, the microphone would record the signal 


<?(0- 


s{t-e x /c) 

e x 


(12.33) 


which is equivalent to choosing a = 0 in equation (12.32). 

From equation (12.32), the autocorrelation of the microphone signal 
is 


Rp(T) = E{P{t)P(t + T)} 

= 0Rs( t ) +~t[Ps( T - r o) + Rs( t + r o)] (12.34) 
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Figure 62 . Autocorrelation of microphone signal, 
where # 5 ( 7 ) is the autocorrelation of the source signal, 

1 a 2 


a 


and 

„ _ *2 + *3 ~ f 1 

HD = 

c 

Thus, the autocorrelation of the microphone signal appears as the sum 
of the three curves shown in figure 62, and separating the directly 
radiated sound from the echo would be very difficult because of the 
distributed nature of the autocorrelation Rs( T )- 

However, the power spectral density of the microphone signal is, 
from equation (12.34), 


Sp(uj) = J Rp(r)e tu,T dr = \j3 + 2-ycoscjq)]SsM (12.35) 

where Ss(w) is the power spectral density of the source signed S(t). 
Thus, the source and receiver signals are related by the standard lin ear 
system relation 

s P M = |tf(u,)| 2 s s M 

where the squared frequency response function of the equivalent linear 
filter is 

\H(uj)\ 2 — 0 + 27COSu>hd 
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Thus, if the propagation paths and reflection coefficient were known, 
the source spectrum could be recovered by 

5s(w) 0 + 2-rcoswn) 

Generally, the propagation paths and reflection coefficient(s) are not 
known, in which case equation (12.35) may be written 

Sp{uj) = /3SsM (l + “ cosuiro'j 


Thus, 


In 5 p ( w ) = In Sq(uj) + ln (^ i ) + In (l + ^ jcosu /7 p ) 


where Sq{u) = Ss( w )/^i is the power spectral density of the echo-free 
signal (eq. (12.33)), and the power cepstrum of the microphone signal 
is 


-r> 


roo r oo 

C p (r)=* / ln[Sp(u>)] e ,WT <iw = / ln[Sg(w)] e ,u ' T <iw 

y -oo 
roo 

+ i - ’ ' • f 


■ ln/ttj f e tur du + f In ^1 + ^coawroj e 1 **' 1 ' eiw ( 12 . 36 ) 

y — oo * — oo 


Here, the first integral is the power cepstrum of the echo-free signal and 
the second is 2tt6(t ) by equation (2.6). Further, for |x| < 1 


ln(l + x) »x- y + y- 


Since 


22 _ 2 (i?&) 


151 


l l i i. rr i : : 


i 


L li L L L L L L L li L' L l' L 



Introduction to Time Series Analysis 

whose magnitude is less than unity, the third integral in equa- 
tion (12.36) may be written 

J In ^1 + ^ cos uitqJ e luT du 

= ~jj oo COSuJT 0 « IUJT du (y ) J COS 2 u/TD e MT du> + ■ ■ ■ 

- (^) + «(r+Ttj)] 

~ 5 (' j) * [26{t) + 6( < r ~ 2 ^ + %+2n>)l + • • • 

upon noting that cos 2 x = 1/2(1 + cos 2x) and using equation (4.7). 
Thus, the power cepstrum of the microphone signal is 

C p (r) rn Cq(t) + ir ^ ^ <^( T ) 

+ * [(t) h — j ^ r-T ^ + < *( r+T t>)l 

+ " ( 6 < r - 2T 0) + ^ r+2 ^)]+ -- (12.37) 

where Cq(t) is the power cepstrum of the echo-free signal. Further, it 
can be shown that the coefficient of the S(r) term is identically zero. 
Thus, the power cepstrum is as shown in figure 63, where the echo 
shows up as delta functions (called “rahmonics" ) at multiples of the 
delay time tq. 

This cepstrum can theoretically be readily filtered (or “liftered") by 
interpolating the continuous function at the positions where the delta 
functions occur leaving only the echo-free receiver cepstrum. The echo- 
free receiver spectrum is then recovered by inverse Fourier transforming, 
that is, 

Sq(u) = exp f Cq(t) e -lwT dr (12.38) 

It should be mentioned, however, that programming the liftering oper- 
ation may be difficult, particularly when more than one echo is present. 

12.6 Zoom FFT 

In recent years, the manufacturers of stand-alone spectral analyz- 
ers have developed a new feature called the zoom FFT/ 1 which allows 
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Figure 63. Power cepstrum of microphone signal. 


one to greatly increase the resolution in a small portion of the spec- 
trum. However, although the technique involves clever hardware im- 
plementation, it does not violate the fundamental resolution constraint 
(eq. (7.33)) that 

|Aw|>y 

where T is the length of the record. 

The method involves two assumptions: 

1. That data storage is limited by the memory of the spectral analyzer 
to N real data points 

2. That access to additional data is essentially unlimited 

Consider N data points x(nA*) for n = 0, 1, 2, . . . , N — 1. The FFT of 
these data is 

iV-l 

X{kAu) = x{nAt) e~' 2irkn/ ‘ w (12.39) 

n=0 

where A uj = 2 tt/N A t. Thus, the bandwidth of the FFT must satisfy 
the fundamental constraint 




If N is fixed by memory restrictions, the only way to increase the reso- 
lution (i.e., reduce Au) is to increase A t. However, increasing A t would 
lower the Nyquist frequency and introduce aliasing into the spectral es- 
timate. The way out of this dilemma is to use the old technique that 
electrical engineers call heterodyning. That is, multiply the signal by 
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Figure 64. Interesting region in spectral estimate. 


another signal to generate sum and difference frequencies, filter out the 
sum frequencies, and analyze only the difference frequencies. These fre- 
quencies are low and can thus be sampled at a much larger A t without 
introducing aliasing. However, since the record length is T = N At, the 
record length must increase in order to obtain the same number of data 
points at the larger At. 

The implementation of the zoom FFT technique involves the follow- 
ing steps. From the FFT (eq. (12.39)) of the original N data points, the 
spectral density may be estimated in the region 0 < w < w c = N/2 Aw 
as shown in figure 64, where Aw = 2n/NAt. Suppose that the range 
ui — l Aw < w < w u = u Aw is of particular interest. Then, let 

l + u 
m — 

and multiply the original data X(nAf) by the complex exponential 
® obtain the series of complex data points 

z(nAt) = e -*2jrmn/lV I (nAt) 

The FFT of this new series would be 

/V-l 

Z(kAu) = z{nAt)e~ i2 * kn / N 

n= 0 
/V-l 

= £ x(nAt)e~ ,2ff(/c+m > n / ;v 
n»0 

= X[{k + m)Aw] 

Thus, the power in the random process Z(t) at the frequency of 
zero is precisely the power in the random process X{t) at the frequency 
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|X(W)| |Z(C0) 



Figure 65. Change in frequency origin. 

|H(W)| 




1 


02-m] 

)Aco 

(u-m)Aco 


Figure 66. Low-pass zoom filter. 


m Au/ and the frequency origin has effectively been moved as shown in 
figure 65. 

Then, suppose that the random process Z(t) is passed through a low- 
pass digital filter with frequency response function as shown in figure 66 
to yield a new random process Y (t) containing only low frequencies such 
that H < (u — m)AcL/. Since Y{t) contains only low frequencies, the 
Nyquist criterion (eq. (9.7)) then says that the data are being sampled 
more frequently than necessary. In fact, one may now use a new A£ 
given by 


AW = 


7T _ iV 

(u — m) Aw 2 (u — m) 


A t 


(12.40) 


For example, if N = 128 and u - m = 2, then A£ n ew = 32 A t and the 
same information cam be obtained by keeping only every 32nd vadue of 
Y(t). 

Suppose one keeps only every Af n ew data point and still fills up the 
memory. The memory now holds N / 2 complex data points y(nA£ n ew) 
for n = 0, 1, 2, ... , N/2 - 1. The FFT of these data 


Y (fcAu; ne w ) — 


yV/2-1 

£ 


n=0 


y(nAt new ) e ~ i2 * kn /(N/2) 


155 



L 


y 


I 


y 

k 


1 : 


L 


•» i 

L A 


U li 11 L r L L L L L li II L* U 1 



Introduction to Time Series Analysis 

then yields a spectral estimate only in the region of interest with 
resolution 


Aa/ n ew — 


2 7T 


4(u — m) 


Au/ 


(iV/2)Atnew N 
Thus, for the same example, N = 128 and (u - m) = 2, 


(12.41) 


Au; nC w — -jj 

and the resolution is increased by a factor of 16. However, the total 
record length required to do this is 

N N 

i-Ai n ew = Tf ?(N At) 

or 16 times more data for the example cited. 


12.7 Digital Spectral Analysis of Periodic Signals 

Often one is interested in analyzing a periodic signal, which may 
or may not be contaminated by random noise. For example, the noise 
produced by a helicopter is largely due to the periodic motion of the 
rotor. However, there are also other sources of helicopter noise that are 
more random. In this case, many of the problems seen in the analysis 
of random data, as well as some new ones, arise. 

Recall that if f(t) is a periodic signal with period p, then it may be 
represented by the Fourier series (eq. (2.4)) 

oo 

/(*) = y + (On cosuV + 6 n Sin uj n t) 

1 n=l 
00 

= Fne iunt (12.42) 

n=-oo 


where u n = 2irn/p are harmonics of the fundamental radian frequency 
<jj\ = 2 ir/p of the signal. The Fourier integral transform of this signal 
is (eq. (2.7)) 

OO 

F{<jj) = FnS(0J-'jJ n ) 

— oo 
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Figure 67. Amplitude spectrum of periodic signal. 

Such harmonic analyses are ordinarily represented by the amplitude 
spectrum as shown in figure 67. In terms of the more familiar a s and 
6’s, 



Note that this is a much different representation from a power spectral 
density since |F n | 2 w the finite power in the signal at the discrete fre- 
quency u n . One is ordinarily interested in obtaining accurate estimates 
of these amplitudes \F n \ and the frequencies u n . 

For the digital analysis of such data, it might intuitively be expected 
that one ought to analyze data of length 

T = up (12.43) 

where u is an integer. That is, the data length ought to be some whole 
number of periods of the signal. Equation (12.43) can, in fact, be shown 
to be true mathematically. 

Suppose that the periodic signal is sampled at equal intervals At 
yielding data /(j At) for j = 0 t 1, 2, . . . , N - 1 for a total sample length 
of T = N At. The finite Fourier transform of this signal is 


iV-l 

F(*Aw) = £ /( J AO 

J=0 


e -i2xkj/N 


(12.44) 


which is evaluated at the frequencies 


u,* = JkAu, = ^ (* = 0,1,2 JV/ 2) 

The signal (eq. (12.42)) has power occurring at the frequencies 
u/ n = 2-irn/p. Thus, the frequencies at which equation (12.44) is 
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evaluated may not be the frequencies at which equation (12.42) has 
power! For n fixed, there is a A; such that = uj t n if and only if 

2t rk _ 27rn 



If one wishes this to be true for all n such that u n is less than the 
Nyquist frequency, u j c = ir/At, then 

T = up 

where u is an integer and the k corresponding to a given n is 

k = un 


Thus, the data length T must be some integer number of periods of the 
signal . 

Unfortunately, it is not always possible to satisfy the criterion in 
equation (12.43) that the data length be some integer number of periods 
because either (1) The period is not known a priori or (2) use of the 
FFT requiring N = 2* does not make it convenient. 

Then, what happens if one proceeds with the analysis when the 
frequencies at which the discrete Fourier transform is evaluated are not 
equal to the frequencies at which the periodic signal has power? This 
case has been studied extensively by Burgess. 32 

Suppose T = up where u is not an integer and data f(jAt) for 
j = 0, 1 , 2 , . . . , N — 1 from a periodic process f{t) exist. The discrete 
Fourier transform of these data is 


where 
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3 = o 


N-l oo 

-E E 


e -i2nkj / N 


j=Q n=— oo 
n=s— oo j=0 
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Figure 68. Square wave of amplitude A. 


since T = N <±t = up. Thus, defining 


J* o 

1 — gi27r(m/— fc) 
j _ e i2ir(m/— /c)/7V 


(12.45) 


and noting that division by N is necessary for equation (12.44) to 
approximate equation (2.5), the estimated Fourier coefficients are 


F(kAoj) 

N 


T. F r* w kn{ t') 


(12.46) 


which is a weighted sum of all the Fourier coefficients with respect 
to the very complicated weighting function (eq. (12.45)). Physically, 
w kni u ) represents a transfer of that power which in the periodic signal 
would appear at the frequency u ; n = 2 nn/p to the estimated Fourier 
coefficient at the frequency u/fc = 2 nk/T. This phenomenon is called 
leakage and can result in substantial errors in both the amplitude and 
the frequency of the Fourier coefficients, basically dependent on how 
close nu ever gets to k . 

Periodic signals may also have problems with aliasing, if they contain 
power at frequencies higher than the Nyquist frequency, just as in 
random signals. In fact, aliasing is much more evident in periodic 
signals. A simple example that can be readily analyzed is to consider 
a square wave of amplitude A as shown in figure 68. The Fourier 
coefficients are given by equation (2.5): 
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F n m±£f{t)e- iu * t dt 

= - f P/ 2 e~' Wnt dt-- [ P e~ tuJnt dt 
P Jo P Jp/2 


Now, suppose that the signal is sampled at equal intervals At for 
one period, that is, N At as p, and that N is even. Then, by 
equation (12.44), noting that /( 0) = /(p/2) = 0 because of the jump 
discontinuity, 


iV-l 

F(*Aw) = ]T fti&t) e~ i2Tk ^ N 
j- o 

N/ 2-1 jV-l 

= A £ ^ 

j*iV/2+l 


= iA[(-l)* - l] cot “r 


Thus, the approximate Fourier coefficients are 


£, _ F(nAu) iA , . irfr 

F " ■ - — • Tv [(-1) -‘'“‘iv 


- $(-ir - 1 ] 


tf 2 “lflal 

,«r (2*)! 


/7rn\ 2*-l 

V aTJ 


upon expansion of the cotangent. Here £ 2 * is a Bernoulli number. 
Note that the first term is the exact Fourier coefficient and the 3 um 
represents aliased terms. Thus, periodic signals must also be filtered to 
avoid aliasing. 


12.8 Spectral Analysis of Nonstationary Random Processes 

Practically all the analysis discussed thus far in this monograph 
has assumed that the random process of interest is at least weakly 
stationary. When this is not true, two major problems arise: 

1. Since the statistics of the random process vary with time, time 
averages cannot be used to reduce variability. Thus, an ensemble 
of sample functions must be collected, analyzed, and averaged. 
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2. Since second order moments depend on two variables instead of one, 
interpretation is more difficult. 

For example, the autocorrelation of a nonstationary random process 
X(£), Rxihy h), depends on two variables t\ and £ 2 - The corresponding 
power spectral density may be defined as 

S X ^) = ±j dt l J 

= E{X*( Wl )A-(u;2)} (12-47) 


where 

= h S^oc X[t)e ~ xultdt 

The autocorrelation may then be recovered by the inversion relation: 
Rx(h .12)=/ dwi [ dw2 5x(^i,u;2)e _l(u ' ltl-u ' , ‘ 2) (12.48) 

7-oo 7— o o 

However, the power in the process at time £ is 

E[X 2 (t)] = R x (t,t) = f°° du x f°° du 2 S x {ux,u 2 )e- t ^-^ )t 

7—oo 7—oo 

Thus, the power spectral density (eq. (12.47)) does not admit a simple 
interpretation in terms of power per unit frequency. If the random 
process were stationary, then Sx(u>i,u; 2 ) = Sx(wi)$(u/ 2 --u/i). Thus, 
it has been suggested 3 ^ that the spread of the values of the spectrum 
Sx{u\,u>2) about the line u>i = u >2 is a measure of the nonstationarity 
of the random process. One virtue of the definition in equation (12.47) 
is that if X(£) is input to a linear system, the power spectral density of 
the output Y{t) is given by 

Sy(u/i,u; 2 ) = H m {u\)H {(^2) Sx{^ 1^2) 


However, although this definition is relatively straightforward, it has 
proven to be of limited practical use. 

A more useful definition is accomplished by introducing the variables 


z h + h _ 

t = T = £2 “ £l 
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called the mean time and time difference. Then, the autocorrelation 
may be written 


RxihM) = Rxit-T/^t+rfi) 
and one can define the time varying power spectral density as 

Sx(l“) = ^ J^Rx(t-r/2,t+T/2)e- MT dr (12.49) 

with inverse 

Rx(i-r/2,t+r/2) m f S X {1 w) e'“ T du (12.50) 

Although this definition does not satisfy a simple relation for a linear 
system, the power is 


£{X 2 (t)} = Rx(hi) = r Sx(i,u)du 

J — OO 

Thus, S(t,u>) admits interpretation as the power per unit frequency in 
the signal at time t. This definition has proven useful, probably being 
most widely used in the field of voice analysis and identification. 

A class of nonstationary random processes that have been fairly 
widely studied are called “pseudostationary” random processes. Here 

Y(t) = A(t)X(t) (12.51) 

where X(t) is a stationary random process and A(<) is a deterministic 
modulation signal that is assumed to vary much more slowly than 
X(t). For example, figure 69 presents the acoustic pressure time history 
measured by a stationary microphone as an aircraft flies over it. Such 
data can be represented by equation (12.51), although the Doppler shift 
in frequency inherent in this measurement technique must also be taken 
into account. For processes that can be described by equation (12.51). 


R Y (i-T/2,t+r/2) = E{Y{1 -t/2)Y(1+t/2)) 

= A{i-T/2)A(t+T/2)E{X(i-r/2)X(i+T/2)} 

* A 2 (i)Rx(r) 


and thus 
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Chapter XII Special Topics 



Figure 69. Acoustic pressure time history produced by aircraft flyover. 


X(t) 



where Sx(w) is the ordinary power spectral density of X(t). 

Another type of nonstationary signal that seems to occur fairly 
frequently is shown in figure 70. This type might be called identifiably 
nonstationary. In such signals, the time history consists of two (or 
more) regions where clearly different phenomena are occurring. Such a 
record might be produced, for example, by a velocity sensor mounted 
on an aircraft flying in and out of thunderstorms, by a microphone 
measuring the noise level inside a train traversing sections of smooth 
and rough track, or by a seismometer recording periods of more and less 
seismic activity. For such records, it seems reasonable to break the time 
history into blocks corresponding to the different regions and analyze 
the records in the blocks from each region as if they were produced by 
a single stationary random process. 

While such an approach undoubtably produces useful information, 
it requires much engineering judgment. First, the various regions must 
be identified and the break points between them determined. Then, 
the reasonability of treating the sections of a given region as stationary 
must be evaluated. For this purpose, the test for stationarity given in 
chapter VI is useful. Sometimes the means may not pass such a test, 
although the variances do. In this case, the mean of each block may be 
subtracted from the data in that block before analysis. Other times, 
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Introduction to Time Series Analysis 

detrending, as was discussed in chapter XI, may have to be applied to 
each individual block before the stationarity test is satisfied. 

Assuming satisfactory passage (which again requires engineering 
judgment), spectral estimates for each block can then be computed and 
averaged together, just as in block averaging, to reduce the statistical 
variability. However, the blocks may be so short that the needed 
resolution cannot be obtained with standard techniques. In this case, 
it may be necessary to apply techniques such as the maximum entropy 
or maximum likelihood methods. 

In analyzing such data f one should use good engineering judgment at 
every step along the way and should always be mindful in interpreting 
the final results that many assumptions have gone into the analysis . 

A way in which one can make any random process appear more 
stationary is to “normalize” it. Suppose X(t) is a nonstationary random 
process with mean mx{t) and variance <r£(£). Then, the new random 
process 

m l\ . X[t) m x{t) 

z(,) " - Vx«> 

is such that E{Z{t)} = 0 and = 1- Thus, Z(t) is much closer to 
being stationary than X(£) was. Further, X(£) may be written 

Xit)-m x (t)+*x(t)Z{t) 

which is just the sum of a deterministic signal m^ft) plus a random 
process that looks something like equation (12.51). This type of analysis 
has often been applied to transient phenomena such as vibration during 
a Space Shuttle liftoff or noise measurements during an exhalation of 
breath, where mx(t) and <7^(0 may be estimated by ensemble averages 
over repeated experiments. 

The analysis of nonstationary processes is not in a very satisfactory 
state and may, in fact, never be, although they are the topic of consider- 
able current research. The difficulty lies in the fundamental importance 
of the assumption of stationarity in the analysis and interpretation of 
random data. Basically, the state of the art is that one tries to make the 
nonstationary signal look enough like a stationary signal that station- 
ary techniques may be used. Recently, 34 this approach has been placed 
on a firm foundation by a unified theory that considers more general 
types of invariance under transformation in addition to independence 
of the origin of time, which led to the concept of stationarity. 
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